(54) Title: LIGANDS FOR PHOSPHATASE BINDING ASSAY
(57) Abstract

Disclosed are new ligands for use in a binding assay for proteases and phosphatases which contain cysteine in their binding sites
or as a necessary structural component for enzymatic binding. The sulfhydryl group of cysteine is the nucleophilic group in the enzyme's
mechanistic proteolytic and hydrolytic properties. The assay can be used to determine the ability of new, unknown ligands and mixtures of
compounds to competitively bind with the enzyme versus a known binding agent for the enzyme, e.g., a known enzyme inhibitor. By the
use of a mutant form of the natural or native wild-type enzyme, in which serine, or another amino acid, e.g., alanine, replaces cysteine, the
problem of interference from extraneous oxidizing and alkylating agents in the assay procedure is overcome. The interference arises because
of oxidation or alkylation of the sulfhydryl, -SH (or -S⁻), in the cysteine, which then adversely affects the binding ability of the enzyme.
Specifically disclosed is an assay for tyrosine phosphatases and cysteine proteases, including caspases and cathepsins, e.g., Cathepsin
K (O2), utilizing scintillation proximity assay (SPA) technology. The assay has important applications in the discovery of compounds for
the treatment and study of, for example, diabetes, immunosuppression, cancer, Alzheimer's disease and osteoporosis. The novel feature
of the use of a mutant enzyme can be extended to its use in a wide variety of conventional colorimetric, photometric, spectrophotometric,
radioimmunoassay and ligand-binding competitive assays.
TITLE OF THE INVENTION

LIGANDS FOR PHOSPHATASE BINDING ASSAY
FIELD OF THE INVENTION

This invention relates to the use of mutant phosphatase
and protease enzymes in a competitive binding assay. Specific
examples are the enzymes tyrosine phosphatase and cysteine
protease, e.g., Cathepsin K, and the assay specifically described is a
scintillation proximity assay using a radioactive inhibitor to induce
scintillation.

BACKGROUND OF THE INVENTION 

The use of the scintillation proximity assay (SPA) to
study enzyme binding and interactions is a new type of
radioimmunoassay and is well known in the art. The advantage of
SPA technology over more conventional radioimmunoassay or
ligand-binding assays is that it eliminates the need to separate
unbound ligand from bound ligand prior to ligand measurement. See,
for example, Nature, Vol. 341, pp. 167-178, entitled "Scintillation
Proximity Assay" by N. Bosworth and P. Towers; Anal. Biochem.
Vol. 217, pp. 139-147 (1994), entitled "Biotinylated and Cysteine-
Modified Peptides as Useful Reagents For Studying the Inhibition of
Cathepsin G" by A.M. Brown et al.; Anal. Biochem. Vol. 223, pp. 259-
265 (1994), entitled "Direct Measurement of the Binding of RAS to
Neurofibromin Using Scintillation Proximity Assay" by R. H.
Skinner et al.; and Anal. Biochem. Vol. 230, pp. 101-107 (1995), entitled
"Scintillation Proximity Assay to Measure Binding of Soluble
Fibronectin to Antibody-Captured alpha5beta1 Integrin" by J. A.
Pachter et al.

The basic principle of the assay lies in the use of a solid
support containing a scintillation agent, wherein a target enzyme is
attached to the support through, e.g., a second enzyme/anti-enzyme
linkage. A known tritiated or I-125 iodinated binding agent, i.e., a
radioligand inhibitor for the target enzyme, is utilized as a
control which, when bound to the active site in the target enzyme, is
in close enough proximity to the scintillation agent to induce a
scintillation signal, e.g., photon emission, which can be measured by
conventional scintillation/radiographic techniques. The unbound
tritiated (hot) ligand is too far removed from the scintillation agent to
cause an interfering measurable scintillation signal and therefore
does not need to be separated, e.g., by filtration, as in conventional
ligand-binding assays.

The binding of an unknown or potential new ligand
(cold, being non-radioactive) can then be determined in a competitive
assay versus the known radioligand, by measuring the resulting
change in the scintillation signal, which will significantly decrease
when the unknown ligand also possesses good binding properties.

However, a problem arises when utilizing a target
enzyme containing a cysteine group having a free thiol linkage,
-SH (or present as -S⁻), which is in the active site region or is closely
associated with the active site and is important for enzyme-ligand
binding. If the unknown ligand or mixture, e.g., natural product
extracts, human body fluids, cellular fluids, etc., contains reagents
which can alkylate, oxidize or chemically interfere with the cysteine
thiol group such that normal enzyme-ligand binding is disrupted,
then false readings will occur in the assay.

What is needed in the art is a method to circumvent and
avoid the problem of cysteine interference in the scintillation
proximity assay (SPA) procedure in enzyme binding studies.
SUMMARY OF THE INVENTION

We have discovered that by substituting serine for
cysteine in a target enzyme, where the cysteine plays an active role in
the wild-type enzyme-natural ligand binding process, usually as the
catalytic nucleophile in the active binding site, a mutant is formed
which can be successfully employed in a scintillation proximity assay
without any active site cysteine interference.

This discovery can be utilized for any enzyme which
contains cysteine groups important or essential for binding and/or
catalytic activity as proteases or hydrolases and includes
phosphatases, e.g., tyrosine phosphatases, and proteases, e.g.,
cysteine proteases, including the cathepsins, e.g., Cathepsin K (O2),
and the caspases.

Further, use of the mutant enzyme is not limited to the
scintillation proximity assay; it can be used in a wide variety of
known assays including colorimetric, spectrophotometric, ligand-
binding assays, radioimmunoassays and the like.

We have furthermore discovered a new method of
amplifying the effect of a binding agent ligand, e.g., a radioactive
inhibitor, useful in the assay by replacing two or more
phosphotyrosine residues with 4-phosphono(difluoromethyl)-
phenylalanine (F2Pmp) moieties. The resulting inhibitor exhibits a
greater and more hydrolytically stable binding affinity for the target
enzyme and a stronger scintillation signal.
By this invention there is provided a process for
determining the binding ability of a ligand to a cysteine-containing
wild-type enzyme comprising the steps of:

(a) contacting a complex with the ligand, the complex
comprising a mutant form of the wild-type enzyme,
in which cysteine, at the active site, is replaced
with serine, in the presence of a known binding
agent for the mutant enzyme, wherein the binding
agent is capable of binding with the mutant
enzyme to produce a measurable signal.

Further provided is a process for determining the
binding ability of a ligand, preferably a non-radioactive (cold) ligand,
to an active site cysteine-containing wild-type tyrosine phosphatase
comprising the steps of:

(a) contacting a complex with the ligand, the complex
comprising a mutant form of the wild-type enzyme,
the mutant enzyme being PTP1B, containing the
same amino acid sequence 1-320 as the wild-type
enzyme, except at position 215, in which cysteine is
replaced with serine in the mutant enzyme, in the
presence of a known radioligand binding agent for
the mutant enzyme, wherein the binding agent is
capable of binding with the mutant enzyme to
produce a measurable beta radiation-induced
scintillation signal.



Also provided is a new class of peptide binding agents selected
from the group consisting of:

N-Benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-
phosphono(difluoromethyl)]-L-phenylalanineamide (BzN-EJJ-CONH2), where
E is glutamic acid and J is [4-phosphono(difluoromethyl)]-L-phenylalanyl;

N-Benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-
phosphono(difluoromethyl)]-L-phenylalanine amide;

N-Acetyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-
phosphono(difluoromethyl)]-L-phenylalanine amide;

L-Glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono-
(difluoromethyl)]-L-phenylalanine amide;

L-Lysinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono-
(difluoromethyl)]-L-phenylalanine amide;

L-Serinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono-
(difluoromethyl)]-L-phenylalanine amide;

L-Prolinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono-
(difluoromethyl)]-L-phenylalanine amide; and

L-Isoleucinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono-
(difluoromethyl)]-L-phenylalanine amide; and their tritiated and I-125 iodinated
derivatives.
Further provided is a novel tritiated peptide, tritiated
BzN-EJJ-CONH2, being N-(3,5-Ditritio)benzoyl-L-glutamyl-[4-
phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono-
(difluoromethyl)]-L-phenylalanineamide, wherein E, as used herein,
is glutamic acid and J, as used herein, is the (F2Pmp) moiety,
(4-phosphono(difluoromethyl)-phenylalanyl).

Furthermore there is provided a process for increasing
the binding affinity of a ligand for a tyrosine phosphatase or cysteine
protease comprising introducing into the ligand two or more 4-
phosphono(difluoromethyl)-phenylalanine groups; also provided is
the resulting disubstituted ligand.

In addition there is provided a complex comprised of: 

(a) a mutant form of a wild-type enzyme, in which
cysteine, necessary for activity in the active site, is
replaced with serine and is attached to:

(b) a solid support.

BRIEF DESCRIPTION OF THE DRAWINGS 

FIGURE 1 illustrates the main elements of the invention
including the scintillation agent 1, the supporting (fluomicrosphere)
bead 5, the surface binding Protein A 10, the linking anti-GST
enzyme 15, the fused enzyme construct 20, the GST enzyme 25, the
mutant enzyme 30, the tritiated peptide inhibitor 35, the beta
radiation emission 40 from the radioactive peptide inhibitor 35 and
the emitted light 45 from the induced scintillation.

FIGURE 2 (A and B) illustrates the DNA and amino acid
sequences for PTP1B tyrosine phosphatase enzyme, truncated to
amino acid positions 1-320. (Active site cysteine at position 215 is in
bold and underlined).



FIGURE 3 (A, B and C) illustrates the DNA and amino
acid sequences for Cathepsin K. The upper nucleotide sequence
represents the cathepsin K cDNA sequence which encodes the
cathepsin K preproenzyme (indicated by the corresponding three
letter amino acid codes). Numbering indicates the cDNA nucleotide
position. The underlined amino acid is the active site Cys139 residue
that was mutated to either Ser or Ala.

FIGURE 4 (A and B) illustrates the DNA and amino acid
sequences for the caspase, apopain. The upper nucleotide sequence
represents the apopain (CPP32) cDNA sequence which encodes the
apopain proenzyme (indicated by the corresponding three letter
amino acid codes). Numbering indicates the cDNA nucleotide
position. The underlined amino acid is the active site Cys163 residue
that was mutated to Ser.

DETAILED DESCRIPTION OF THE INVENTION 

The theory underlying the main embodiment of the
invention can be readily seen and understood by reference to
FIGURE 1.

Scintillation agent 1 is incorporated into small (yttrium
silicate or PVT fluomicrospheres, AMERSHAM) beads 5 that
contain on their surface immunosorbent protein A 10. The protein A
coated bead 5 binds the GST fused enzyme construct 20, containing
GST enzyme 25 and PTP1B mutant enzyme 30, via anti-GST enzyme
antibody 15. When the radioactive, e.g., tritiated, peptide 35 is bound
to the mutant phosphatase enzyme 30, it is in close enough proximity
to the bead 5 for its beta emission 40 (or Auger electron emission in
the case of I-125) to stimulate the scintillation agent 1 to emit light
(photon emission) 45. This light 45 is measured as counts in a beta
plate counter. When the tritiated peptide 35 is unbound it is too
distant from the scintillation agent 1 and the energy is dissipated
before reaching the bead 5, resulting in low measured counts. Non-
radioactive ligands which compete with the tritiated peptide 35 for the
same binding site on the mutant phosphatase enzyme 30 will remove
and/or replace the tritiated peptide 35 from the mutant enzyme 30,
resulting in lower counts than the uncompeted peptide control. By
varying the concentration of the unknown ligand and measuring the
resulting lower counts, the concentration giving 50% inhibition (IC50)
of radioligand binding to the mutant enzyme 30 can be obtained. This
then is a measure of the binding ability of the ligand to the mutant
enzyme and the wild-type enzyme.
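
As an illustration of how an IC50 might be read off such a titration, the following is a minimal sketch only. It assumes a log-linear interpolation between the two test concentrations that bracket 50% inhibition; the function names, the background-count parameter and the titration values are hypothetical and are not taken from this specification.

```python
import math


def percent_inhibition(counts, control_counts, background_counts=0.0):
    """Percent loss of scintillation signal relative to the uncompeted control."""
    window = control_counts - background_counts
    if window <= 0:
        raise ValueError("control counts must exceed background counts")
    return 100.0 * (1.0 - (counts - background_counts) / window)


def estimate_ic50(concentrations, counts, control_counts, background_counts=0.0):
    """Interpolate, on a log10 concentration scale, the concentration at 50% inhibition.

    Returns None when the data never bracket 50% inhibition.
    """
    points = sorted(zip(concentrations, counts))
    inhibition = [
        (conc, percent_inhibition(cpm, control_counts, background_counts))
        for conc, cpm in points
    ]
    for (c_lo, i_lo), (c_hi, i_hi) in zip(inhibition, inhibition[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi > i_lo:
            fraction = (50.0 - i_lo) / (i_hi - i_lo)
            log_ic50 = math.log10(c_lo) + fraction * (math.log10(c_hi) - math.log10(c_lo))
            return 10.0 ** log_ic50
    return None


# Hypothetical titration of a cold ligand (nM) against counts per minute.
concentrations_nM = [1.0, 10.0, 100.0, 1000.0]
measured_cpm = [9000.0, 7000.0, 3000.0, 1000.0]
ic50_nM = estimate_ic50(concentrations_nM, measured_cpm, control_counts=10000.0)
print(f"estimated IC50 = {ic50_nM:.1f} nM")
```

In practice a full dose-response curve fit would usually be preferred; the interpolation above only shows the arithmetic relating lower counts to an IC50.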

The term "complex" as used herein refers to the
assembly containing the mutant enzyme. In its simplest
embodiment, the complex is a solid support with the mutant enzyme
attached to the surface of the support. A linker can also be employed.
As illustrated in FIGURE 1, the complex can further comprise a bead
(fluopolymer), anti-enzyme GST/enzyme GST-mutant enzyme-PTP1B
linking construct, immunosorbent protein A, and scintillation agent.
In general, the complex requires a solid support (beads,
immunoassay column of, e.g., Al2O3, or silica gel) to which the
mutant enzyme can be anchored or tethered by attachment through a
suitable linker, e.g., an immunosorbent (e.g., Protein A, Protein G,
anti-mouse, anti-rabbit, anti-sheep) and a linking assembly,
including an enzyme/anti-enzyme construct attached to the solid
support.

The term "cysteine-containing wild-type enzyme", as
used herein, includes all native or natural enzymes, e.g.,
phosphatases, cysteine proteases, which contain cysteine in the
active site as the active nucleophile, or contain cysteine closely
associated with the active site that is important in binding activity.

The term "binding agent" as used herein includes all
ligands (compounds) which are known to be able to bind with the
wild-type enzyme and usually act as enzyme inhibitors. The binding
agent carries a signal-producing agent, e.g., a radionuclide, to initiate
the measurable signal. In the SPA assay the binding agent is a
radioligand.

The term "measurable signal" as used herein includes
any type of generated signal, e.g., radioactive, colorimetric,
photometric, spectrophotometric, scintillation, which is produced
upon binding of the binding agent to the mutant enzyme.

The present invention assay further overcomes problems
encountered in the past, where compounds were evaluated by their
ability to affect the reaction rate of the enzyme in the phosphatase
activity assay. However, this did not give direct evidence that
compounds were actually binding at the active site of the enzyme.
The binding assay described herein, using a substrate
analog, can determine directly whether mixtures of natural
products can irreversibly modify the active site cysteine in the target
enzyme, resulting in inhibition of the enzymatic activity. To overcome
inhibition by these contaminants in the phosphatase assay, a mutated
Cys(215) to Ser(215) form of the tyrosine phosphatase PTP1B was
cloned and expressed, resulting in a catalytically inactive enzyme. In
general, replacement of cysteine by serine will lead to a catalytically
inactive or substantially reduced activity mutant enzyme.

PTP1B is the first protein tyrosine phosphatase to be
purified to near homogeneity {Tonks et al. JBC 263, 6731-6737 (1988)}
and sequenced by Charbonneau et al. PNAS 85, 7182-7186 (1988). The
sequence of the enzyme showed substantial homology to a duplicated
domain of an abundant protein present in hematopoietic cells
variously referred to as LCA or CD45. This protein was shown to
possess tyrosine phosphatase activity {Tonks et al. Biochemistry 27,
8695-8701 (1988)}. Protein tyrosine phosphatases have been known to
be sensitive to thiol oxidizing agents, and alignment of the sequence of
PTP1B with subsequently cloned Drosophila and mammalian
tyrosine phosphatases pointed to the conservation of a cysteine
residue {M. Streuli et al. Proc. Natl. Acad. Sci. USA, Vol. 86, pp. 8698-8702
(1989)} which, when mutated to Ser, inactivated the catalytic activity of
the enzymes. Guan et al. (1991) {J.B.C. Vol. 266, 17026-17030, 1991}
cloned the rat homologue of PTP1B, expressed a truncated version of
the protein in bacteria, purified it and showed that the Cys at position 215 is
the active site residue. Mutation of Cys215 to Ser215 resulted in
loss of catalytic activity. Human PTP1B was cloned by Chernoff et al.
Proc. Natl. Acad. Sci. USA 87, 2735-2739 (1990).

Work leading up to the development of the substrate
analog BzN-EJJ-CONH2 for PTP1B was published by T. Burke et al.,
Biochem. Biophys. Res. Comm. 204, pp. 129-134 (1994), with the
synthesis of the hexamer peptide containing the phosphotyrosyl
mimetic F2Pmp. We have incorporated the (F2Pmp) moiety (4-
phosphono-(difluoromethyl)phenylalanyl) into various peptides that
led to the discovery of BzN-EJJ-CONH2 (where E is glutamic acid
and J, as used herein, is the F2Pmp moiety), an active (5 nM) inhibitor
of PTP1B. This was subsequently tritiated, giving the radioactive
substrate analog required for the binding assay.

The mutated enzyme, as the truncated version
containing amino acids 1-320 (see FIGURE 2), has been demonstrated
for the first time to bind the substrate analog BzN-EJJ-CONH2 with
high affinity. The mutated enzyme is less sensitive to oxidizing
agents than the wild-type enzyme and provides an opportunity to
identify novel inhibitors for this family of enzymes. The use of a
mutated enzyme to eliminate interfering contaminants during drug
screening is not restricted to the tyrosine phosphatases and can be
used for other enzyme binding assays as well.

Other binding assays exist in the art in which the basic
principle of this invention can be utilized, namely, using a mutant
enzyme in which a reactive cysteine important for
activity is modified to serine (or a less reactive amino acid) to
render the enzyme more stable to cysteine modifying reagents, such
as alkylating and oxidizing agents. These other ligand-binding
assays include, for example, colorimetric and spectrophotometric
assays, e.g., measurement of produced color or fluorescence,
phosphorescence (e.g., ELISA, solid absorbent assays) and other
radioimmunoassays in which short or long wave light radiation is
produced, including ultraviolet and gamma radiation.

Further, the scintillation proximity assay can also be
practiced without the fluopolymer support beads (AMERSHAM)
illustrated in FIGURE 1. For example, Scintistrips® are
commercially available (Wallac Oy, Finland) and can also be
employed as the scintillant-containing solid support for the mutant
enzyme complex, as can other solid supports which are
conventional in the art.

The invention assay described herein is applicable to a
variety of cysteine-containing enzymes including protein
phosphatases, proteases, lipases, hydrolases, and the like.

The cysteine to serine transformation in the target
enzyme can readily be accomplished by analogous use of the
molecular cloning technique for Cys215 to Ser215 described in the
below-cited reference by M. Streuli et al. for PTP1B, which is hereby
incorporated by reference for this particular purpose.
A particularly useful class of phosphatases is the
tyrosine phosphatases since they are important in cell function.
Examples of this class are: PTP1B, LCA, LAR, DLAR, DPTP (see
Streuli et al., below). Ligands discovered by this assay using, for
example, PTP1B can be useful, for example, in the treatment of
diabetes and immunosuppression.

A useful species is PTP1B, described in Proc. Natl. Acad. Sci.
USA, Vol. 86, pp. 8698-8702 by M. Streuli et al. and Proc. Natl. Acad.
Sci. USA, Vol. 87, pp. 2735-2739 by J. Chernoff et al.
Another useful class of enzymes is the proteases,
including cysteine proteases (thiol proteases), cathepsins and
caspases.

The cathepsin class of cysteine proteases is important
since Cathepsin K (also termed Cathepsin O2, see Biol. Chem. Hoppe-
Seyler, Vol. 376, pp. 379-384, June 1995 by D. Bromme et al.) is
primarily expressed in human osteoclasts and therefore this
invention assay is useful in the study and treatment of osteoporosis.
See US Patent 5,501,969 (1996) to Human Genome Sciences for the
sequence, cloning and isolation of Cathepsin K (O2). See also J. Biol.
Chem. Vol. 271, No. 21, pp. 12511-12516 (1996) by F. Drake et al. and
Biol. Chem. Hoppe-Seyler, Vol. 376, pp. 379-384 (1995) by D. Bromme et
al., supra.

Examples of the cathepsins include Cathepsin B,
Cathepsin G, Cathepsin J, Cathepsin K (O2), Cathepsin L, Cathepsin
M and Cathepsin S.

The caspase family of cysteine proteases provides other
examples where the SPA technology and the use of mutated enzymes
can be used to determine the ability of unknown compounds and
mixtures of compounds to compete with a radioactive inhibitor of the
enzyme. An active site mutant of human apopain CPP32 (caspase-3)
has been prepared. The active site thiol mutated enzymes are less
sensitive to oxidizing agents and provide an opportunity to identify
novel inhibitors for this family of enzymes.

Examples of the caspase family include: caspase-1 (ICE),
caspase-2 (ICH-1), caspase-3 (CPP32, human apopain, Yama),
caspase-4 (ICErel-II, TX, ICH-2), caspase-5 (ICErel-III, TY),
caspase-6 (Mch2), caspase-7 (Mch3, ICE-LAP3, CMH-1), caspase-8 (FLICE,
MACH, Mch5), caspase-9 (ICE-LAP6, Mch6) and caspase-10 (Mch4).

Substitution of the cysteine by serine (or by any other
amino acid which lowers the reactivity toward oxidizing and alkylating
agents, e.g., alanine) does not alter the binding ability of the mutant
enzyme to natural ligands. The degree of binding, i.e., binding
constant, may be increased or decreased. The catalytic activity of the
mutant enzyme will, however, be substantially decreased or even
completely eliminated. Thus, natural and synthetic ligands which
bind to the natural wild-type enzyme will also bind to the mutant
enzyme.

Substitution by serine for cysteine also leads to a
mutant enzyme which has the same quantitative binding ability as
the natural enzyme but is significantly reduced in catalytic
activity. Thus, this invention assay is actually measuring the true
binding ability of the test ligand.

The test ligand described herein is a new ligand
potentially useful for drug screening purposes, and its mode of action
is generally to function as an inhibitor for the enzyme.

The binding agent usually is a known ligand used as a
control and is capable of binding to the natural wild-type enzyme and
the mutant enzyme employed in the assay, and is usually chosen as a
known peptide inhibitor for the enzyme.

The binding agent also contains a known signal-
producing agent to cause or induce the signal in the assay and can be
an agent inducing, e.g., phosphorescence or fluorescence (ELISA), a
color reaction or a scintillation signal.

In the instant embodiment, where the assay is a
scintillation assay, the signal agent is a radionuclide, e.g., tritium or
I-125, which induces the scintillant in the solid support to emit
measurable light radiation, i.e., photon emission, which can be
measured by using conventional scintillation and beta radiation
counters.

We have also discovered that introducing two or more 4-
phosphono(difluoromethyl)phenylalanine (F2Pmp) groups into a
known binding agent greatly enhances the binding affinity of the
binding agent to the enzyme and improves its stability by rendering
the resulting complex less susceptible to hydrolytic cleavage.

A method for introducing one F2Pmp moiety into a
ligand is known in the art and is described in detail in Biochem.
Biophys. Res. Comm. Vol. 204, pp. 129-134 (1994), hereby incorporated
by reference for this particular purpose.

As a result of this technology we discovered a new class
of ligands having extremely good binding affinity for PTP1B. These
include:

N-Benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-
[4-phosphono(difluoromethyl)]-L-phenylalanine amide,
N-Acetyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-
[4-phosphono(difluoromethyl)]-L-phenylalanine amide,
L-Glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-
[4-phosphono(difluoromethyl)]-L-phenylalanine amide,
L-Lysinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-
[4-phosphono(difluoromethyl)]-L-phenylalanine amide,
L-Serinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-
[4-phosphono(difluoromethyl)]-L-phenylalanine amide,
L-Prolinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-
[4-phosphono(difluoromethyl)]-L-phenylalanine amide, and
L-Isoleucinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-
[4-phosphono(difluoromethyl)]-L-phenylalanine amide.

A useful ligand in the series is BzN-EJJ-CONH2, whose chemical
name is N-Benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-
phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanineamide,
and its tritiated form, N-(3,5-Ditritio)benzoyl-L-glutamyl-[4-
phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono-
(difluoromethyl)]-L-phenylalanineamide.

Synthesis of both cold and hot ligands is described in the
Examples.

The following Examples are illustrative of carrying out
the invention and should not be construed as being limitations on the
scope or spirit of the instant invention.
EXAMPLES

1. Preparation of PTP1B Truncate (Amino Acid Sequence from 1-320)
and Fused GST-PTP1B Construct

An E. coli culture carrying a PET plasmid expressing
the full length PTP1B protein was disclosed in J. Chernoff et al., Proc.
Natl. Acad. Sci. USA, 87, pp. 2735-2739 (1990). This was modified to
a truncated PTP1B enzyme complex containing the active site with
amino acids 1-320 inclusive, by the following procedure:

The full length human PTP-1B cDNA sequence
(published in J. Chernoff et al., PNAS, USA, supra) cloned
into a PET vector was obtained from Dr. Raymond Erickson (Harvard
University). The PTP-1B cDNA sequence encoding amino acids 1-320
(Seq. ID No. 1) was amplified by PCR using the full length sequence
as template. The 5' primer used for the amplification included a
Bam HI site at the 5' end and the 3' primer had an Eco RI site at the
3' end. The amplified fragment was cloned into pCR2 (Invitrogen)
and sequenced to ensure that no sequence errors had been introduced
by Taq polymerase during the amplification. This sequence was
released from pCR2 by a Bam HI/Eco RI digest and the PTP-1B cDNA
fragment ligated into the GST fusion vector pGEX-2T (Pharmacia)
that had been digested with the same enzymes. The GST-PTP-1B
fusion protein expressed in E. coli has an active protein tyrosine
phosphatase activity. This same 1-320 PTP-1B sequence (Seq. ID No.
1) was then cloned into the expression vector pFLAG-2, where FLAG
is the octa-peptide AspTyrLysAspAspAspAspLys. This was done by
releasing the PTP-1B sequence from the pGEX-2T vector by Nco I/Eco
RI digest, filling in the ends of this fragment by Klenow and blunt-
end ligating into the blunted Eco RI site of pFLAG2. Site-directed
mutagenesis was performed on the pFLAG2-PTP-1B plasmid using the
Chameleon double-stranded mutagenesis kit (Stratagene)
to replace the active-site Cys-215 with serine. The
mutagenesis was carried out essentially as described by the
manufacturer and mutants were identified by DNA sequencing. The
FLAG-PTP-1B Cys215Ser mutant (Seq. ID No. 7) was expressed,
purified and found not to have any phosphatase activity. The GST-
PTP-1B Cys215Ser mutant was made using the mutated Cys215Ser
sequence of PTP-1B already cloned into pFLAG2, as follows. The
pFLAG2-PTP-1B Cys215Ser plasmid (Seq. ID No. 7) was digested
with Sal I (3' end of PTP-1B sequence) and filled in using Klenow
polymerase (New England Biolabs); the enzymes were heat
inactivated and the DNA redigested with Bgl II. The 500 bp 3' PTP-1B
cDNA fragment which is released and contains the mutated active
site was recovered. The pGEX-2T-PTP-1B plasmid was digested with
Eco RI (3' end of PTP-1B sequence), filled in by Klenow,
phenol/chloroform extracted and ethanol precipitated. This DNA
was then digested with Bgl II, producing two DNA fragments: a 500
bp 3' PTP-1B cDNA fragment that contains the active site and a 5.5 Kb
fragment containing the pGEX-2T vector plus the 5' end of PTP-1B.
The 5.5 Kb pGEX-2T 5' PTP-1B fragment was recovered and ligated
with the 500 bp Bgl II/Sal I fragment containing the mutated active
site. The ligation was transformed into bacteria (type DH5α, G) and
clones containing the mutated active site sequence were identified by
sequencing. The GST-PTP-1B Cys215Ser mutant was overexpressed,
purified and found not to have any phosphatase activity.
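
The codon change behind a Cys-to-Ser active-site mutation can be sketched in silico as follows. This is an illustration only: the toy coding sequence, the choice of TCC as the serine codon and the function name are hypothetical, and the actual PTP-1B sequence (Seq. ID No. 1) is not reproduced here. The sketch checks that the targeted residue really is encoded by a cysteine codon before replacing it, mirroring the sequencing check described above.

```python
CYS_CODONS = {"TGT", "TGC"}  # standard genetic code codons for cysteine
SER_CODON = "TCC"            # assumed choice: one base away from TGC


def mutate_codon(cds, residue_number, expected_codons, new_codon):
    """Replace the codon for a 1-based residue number in a coding sequence."""
    start = 3 * (residue_number - 1)
    codon = cds[start:start + 3].upper()
    if len(codon) != 3:
        raise ValueError("residue number lies outside the coding sequence")
    if codon not in expected_codons:
        raise ValueError(
            f"residue {residue_number} is encoded by {codon}, not an expected codon"
        )
    return cds[:start] + new_codon + cds[start + 3:]


# Toy open reading frame: Met-Glu-Cys-His-stop (not the PTP-1B sequence).
toy_cds = "ATGGAGTGCCATTAA"
mutant_cds = mutate_codon(toy_cds, 3, CYS_CODONS, SER_CODON)
print(toy_cds, "->", mutant_cds)
```

For the real construct the same call would target residue 215 of the 1-320 coding sequence.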

20 

2. Preparation of Tritiated BzN-EJJ-CONH2

This compound can be prepared as outlined in Scheme 1,
below, and by following the procedures:

Synthesis of N-Benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-
phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanineamide
(BzN-EJJ-CONH2)

1.0 g of TentaGel® S RAM resin (RAPP polymer, ~0.2
mmol/g), as represented by the shaded bead in Scheme 1, was treated
with piperidine (3 mL) in DMF (5 mL) for 30 min. The resin
(symbolized by the circular P, containing the remainder of the
organic molecule except the amino group) was washed successively
with DMF (3 x 10 mL) and CH2Cl2 (10 mL) and air dried. A solution
of DMF (5 mL), Nα-Fmoc-4-[diethylphosphono-(difluoromethyl)]-L-
phenylalanine (350 mg), where Fmoc is 9-fluorenylmethoxycarbonyl,
and O-(7-azabenzotriazol-1-yl)-1,1,3,3-tetramethyluronium
hexafluorophosphate (acronym being HATU, 228 mg) was treated
with diisopropylethylamine (0.21 mL) and, after 15 min., was added
to the resin in 3 mL of DMF. After 1 h, the resin was washed
successively with DMF (3 x 10 mL) and CH2Cl2 (10 mL) and air dried.

The sequence was repeated two times, first using Nα-Fmoc-4-
[diethylphosphono-(difluoromethyl)]-L-phenylalanine and then
using N-Fmoc-L-glutamic acid gamma-t-butyl ester. After the final
coupling, the resin-bound tripeptide was treated with a mixture of
piperidine (3 mL) in DMF (5 mL) for 30 min. and was then washed
successively with DMF (3 x 10 mL) and CH2Cl2 (10 mL) and air dried.
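
As a rough check on the reagent quantities above, the molar equivalents relative to resin loading can be worked out as in the following sketch. The molecular weights (HATU 380.23 g/mol, diisopropylethylamine 129.25 g/mol) and the diisopropylethylamine density (0.742 g/mL) are standard literature values supplied here as assumptions, not figures from this specification, and the resin loading is the approximate 0.2 mmol/g stated above.

```python
def mmol_from_mass(mass_mg, mol_weight):
    """Millimoles of a solid reagent from its mass in mg and molecular weight in g/mol."""
    return mass_mg / mol_weight


def mmol_from_volume(volume_ml, density_g_per_ml, mol_weight):
    """Millimoles of a neat liquid reagent from its volume, density and molecular weight."""
    return volume_ml * density_g_per_ml * 1000.0 / mol_weight


# Resin: 1.0 g at approximately 0.2 mmol/g loading (from the procedure).
resin_mmol = 1.0 * 0.2

# Assumed physical constants (not stated in the procedure).
HATU_MW = 380.23
DIPEA_MW = 129.25
DIPEA_DENSITY = 0.742

hatu_equiv = mmol_from_mass(228.0, HATU_MW) / resin_mmol
dipea_equiv = mmol_from_volume(0.21, DIPEA_DENSITY, DIPEA_MW) / resin_mmol

print(f"HATU:  {hatu_equiv:.2f} equiv")
print(f"DIPEA: {dipea_equiv:.2f} equiv")
```

Under these assumptions the first coupling uses roughly 3 equivalents of HATU and 6 equivalents of base per resin amine.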

To a solution of benzoic acid (61 mg) and HATU (190 mg)
in DMF (1 mL) was added diisopropylethylamine (0.17 mL) and, after
15 min., the mixture was added to a portion of the resin prepared
above (290 mg) in 1 mL DMF. After 90 min., the resin was washed
successively with DMF (3 x 10 mL) and CH2Cl2 (10 mL) and air dried.
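
The reagent quantities in the coupling steps above can be converted to approximate molar equivalents relative to the resin. The following is a minimal sketch only: the molecular weights and the density of diisopropylethylamine are standard literature values that are not stated in this specification, the nominal loading of 0.2 mmol/g is applied unchanged to the 290 mg portion of peptide-loaded resin (an assumption, since the resin gains mass during synthesis), and the function names are hypothetical.

```python
# Approximate reagent equivalents for the couplings described above.
# Constants below are literature values, NOT taken from this specification.
HATU_MW = 380.23          # g/mol
DIPEA_MW = 129.25         # g/mol, diisopropylethylamine
DIPEA_DENSITY = 0.742     # g/mL near room temperature
BENZOIC_ACID_MW = 122.12  # g/mol


def mmol_from_mg(mass_mg, mw):
    """Millimoles of a solid reagent from its mass in mg."""
    return mass_mg / mw


def mmol_from_ml(volume_ml, density_g_per_ml, mw):
    """Millimoles of a neat liquid reagent from its volume in mL."""
    return volume_ml * density_g_per_ml * 1000.0 / mw


def equivalents(reagent_mmol, resin_g, loading_mmol_per_g=0.2):
    """Reagent equivalents relative to the nominal number of resin sites."""
    return reagent_mmol / (resin_g * loading_mmol_per_g)


# First amino acid coupling: 1.0 g resin, 228 mg HATU, 0.21 mL base.
hatu_eq = equivalents(mmol_from_mg(228, HATU_MW), 1.0)
dipea_eq = equivalents(mmol_from_ml(0.21, DIPEA_DENSITY, DIPEA_MW), 1.0)

# Benzoylation: 290 mg resin portion, 61 mg benzoic acid (nominal loading assumed).
benzoic_eq = equivalents(mmol_from_mg(61, BENZOIC_ACID_MW), 0.290)

print(f"HATU {hatu_eq:.1f} eq, DIPEA {dipea_eq:.1f} eq, benzoic acid {benzoic_eq:.1f} eq")
```

Under these assumptions the first coupling uses roughly three equivalents of HATU and six of base, and the benzoylation uses a larger excess of acid.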

The resin was treated with 2 mL of a mixture of TFA:water (9:1) and
0.05 mL of triisopropylsilane (TIPS-H) for 1 h. The resin was filtered
off and the filtrate was diluted with water (2 mL) and concentrated in
vacuo at 35°C. The residue was treated with 2.5 mL of a mixture of
TFA:DMS:TMSOTf (5:3:1) and 0.05 mL of TIPS-H, and stirred at 25°C
for 15 h. (TFA is trifluoroacetic acid, DMS is dimethyl sulfide,
TMSOTf is trimethylsilyl trifluoromethanesulfonate).

The desired tripeptide, the title compound, was purified
by reverse phase HPLC (C18 column, 25 x 100 mm) using a mobile
phase gradient from 0.2% TFA in water to 50/50 acetonitrile/0.2%
TFA in water over 40 min. and monitoring at 230 nm. The fraction
eluting at approximately 14.3 min. was collected, concentrated and
lyophilized to yield the title compound as a white foam.
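
As a rough guide to where the product elutes, the nominal mobile phase composition at a given time can be estimated from the gradient described above. This sketch assumes the gradient is linear and ignores system dwell volume, neither of which is stated in the text; the function name and parameters are hypothetical.

```python
def percent_acetonitrile(t_min, gradient_min=40.0, start_pct=0.0, end_pct=50.0):
    """Nominal % acetonitrile at time t_min for a linear gradient.

    Assumes a linear ramp from start_pct to end_pct over gradient_min
    minutes, with no correction for dwell volume.
    """
    if t_min <= 0:
        return start_pct
    if t_min >= gradient_min:
        return end_pct
    return start_pct + (end_pct - start_pct) * t_min / gradient_min


# Fraction collected at about 14.3 min on the 40 min gradient.
print(round(percent_acetonitrile(14.3), 1))
```

With these assumptions the fraction at 14.3 min corresponds to a nominal composition of a little under 18% acetonitrile.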
Synthesis of N-(3,5-Ditritio)benzoyl-L-glutamyl-[4-phosphono(difluoro-
methyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenyl-
alanineamide

The above procedure described for the preparation of
BzN-EJJ-CONH2 was repeated, but substituting 3,5-dibromobenzoic
acid for benzoic acid. After HPLC purification as before, except using
a gradient over 30 min. and collecting the fraction at approximately
18.3 min., the dibromo-containing tripeptide was obtained as a white
foam.

A portion of this material (2 mg) was dissolved in
methanol/triethylamine (0.5 mL, 4/1), 10% Pd-C (2 mg) was added,
and the mixture was stirred under an atmosphere of tritium gas for 24 h.
The mixture was filtered through celite, washing with methanol, and
the filtrate was concentrated. The title compound was obtained after
purification by semi-preparative HPLC using a C18 column and an
isocratic mobile phase of acetonitrile/0.2% TFA in water (15:100). The
fraction eluting at approximately 5 min. was collected and
concentrated in vacuo. The title compound was dissolved in 10 mL of
methanol/water (9:1) to provide a 0.1 mg/mL solution of specific
activity 39.4 Ci/mmol.
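
The stock solution figures above can be turned into a molar concentration, a radioactive concentration and a dilution factor. The sketch below is illustrative only: it takes the molecular weight of 808 quoted for the tritiated peptide in the Materials list of the assay protocol and the 50 nM working concentration used in the binding assay, and the function names are hypothetical.

```python
def molar_conc_uM(mass_conc_mg_per_ml, mw_g_per_mol):
    """Micromolar concentration from mg/mL (numerically equal to g/L) and molecular weight."""
    return mass_conc_mg_per_ml / mw_g_per_mol * 1e6


def radioactive_conc_mCi_per_ml(conc_uM, specific_activity_Ci_per_mmol):
    """mCi per mL; 1 uM equals 1e-6 mmol/mL, and 1 Ci equals 1000 mCi."""
    return conc_uM * 1e-6 * specific_activity_Ci_per_mmol * 1000.0


def dilution_factor(stock_uM, target_nM):
    """Fold dilution needed to reach a target nanomolar concentration."""
    return stock_uM * 1000.0 / target_nM


stock_uM = molar_conc_uM(0.1, 808.0)                    # 0.1 mg/mL, Mwt. 808
mci_per_ml = radioactive_conc_mCi_per_ml(stock_uM, 39.4)
fold = dilution_factor(stock_uM, 50.0)                  # 50 nM working solution

print(f"{stock_uM:.1f} uM, {mci_per_ml:.2f} mCi/mL, dilute about {fold:.0f}-fold")
```

On these figures the stock is roughly 124 µM and would be diluted about 2500-fold to give the 50 nM working solution.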
SCHEME 1

[Reaction scheme; the structural drawings are not reproduced here.
Reagents and conditions recoverable from the scheme, in order:]

TentaGel® S RAM polymer: piperidine, DMF

Coupling with the Fmoc-protected amino acid bearing the
CF2-PO(OEt)2 group:
1. HATU, (i-Pr)2NEt, DMF
2. piperidine, DMF

SCHEME 1 CONT'D

1. HATU, (i-Pr)2NEt, DMF
2. piperidine, DMF

SCHEME 1 CONT'D

1. TFA-H2O (9:1)
2. TFA-DMS-TMSOTf-TIPSH
3. HPLC purification
4. for X = Br: T2 (g), 10% Pd-C, MeOH, Et3N; HPLC purification

Product: the PO(OH)2-bearing tripeptide, X = H or T

By following the above described procedure for BzN-EJJ-
CONH2, the following other peptide inhibitors were also similarly
prepared:

N-Benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenyl-
alanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide,
N-Acetyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-
[4-phosphono(difluoromethyl)]-L-phenylalanine amide,
L-Glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-
phosphono(difluoromethyl)]-L-phenylalanine amide,
L-Lysinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-
phosphono(difluoromethyl)]-L-phenylalanine amide,
L-Serinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-
phosphono(difluoromethyl)]-L-phenylalanine amide,
L-Prolinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-
phosphono(difluoromethyl)]-L-phenylalanine amide, and
L-Isoleucinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-
phosphono(difluoromethyl)]-L-phenylalanine amide.



4. Phosphatase Assay Protocol 


Materials:

EDTA - ethylenediaminetetraacetic acid (Sigma)

DMH - N,N'-dimethyl-N,N'-bis(mercaptoacetyl)-
hydrazine (synthesis published in J. Org. Chem. 56, pp. 2332-
2337 (1991) by R. Singh and G.M. Whitesides); it can be substituted
with DTT - dithiothreitol

Bistris - 2,2-bis(hydroxymethyl)-2,2',2"-nitrilotriethanol (Sigma)

Triton X-100 - octylphenolpoly(ethyleneglycolether)10 (Pierce)

Antibody: Anti-glutathione S-transferase rabbit (H and
L) fraction (Molecular Probes)

Enzyme: Human recombinant PTP1B, containing
amino acids 1-320 (Seq. ID No. 1), fused to GST enzyme (glutathione
S-transferase) and purified by affinity chromatography. Wild type (Seq.
ID No. 1) contains active site cysteine(215), whereas mutant (Seq. ID
No. 7) contains active site serine(215).

Tritiated peptide: Bz-NEJJ-CONH2, Mwt. 808, empirical
formula C32H32T2O12P2F4



Stock Solutions

(10X) Assay Buffer              500 mM Bistris (Sigma), pH 6.2,
                                MW=209.2
                                20 mM EDTA (GIBCO/BRL)
                                Store at 4°C.

Prepare fresh daily:

Assay Buffer (1X)               50 mM Bistris
(room temp.)                    2 mM EDTA
                                5 mM DMH (MW=208)

WO 98/20024 PCT/CA97/00824

Enzyme Dilution                 50 mM Bistris
Buffer (keep on ice)            2 mM EDTA
                                5 mM DMH
                                20% Glycerol (Sigma)
                                0.01 mg/ml Triton X-100 (Pierce)

Antibody Dilution               50 mM Bistris
Buffer (keep on ice)            2 mM EDTA
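
The buffer recipes above can be converted to weighed amounts with a short calculation. This is a sketch under stated assumptions: the batch volumes (1 L of 10X stock, 100 mL of 1X buffer) are arbitrary examples not given in the text, pH adjustment is not modelled, and the function names are hypothetical.

```python
def grams_needed(conc_mM, mw_g_per_mol, volume_L):
    """Mass of solute in grams for a target millimolar concentration and volume."""
    return conc_mM / 1000.0 * mw_g_per_mol * volume_L


def stock_volume_mL(final_volume_mL, fold=10.0):
    """Volume of an N-fold concentrated stock needed for a given 1X volume."""
    return final_volume_mL / fold


# 10X assay buffer: 500 mM Bistris (MW 209.2), example batch of 1 L.
bistris_g = grams_needed(500, 209.2, 1.0)

# 1X assay buffer, example batch of 100 mL, made fresh from the 10X stock.
stock_mL = stock_volume_mL(100.0)
dmh_mg = grams_needed(5, 208.0, 0.100) * 1000.0  # 5 mM DMH (MW 208)

print(f"Bistris {bistris_g:.1f} g/L; per 100 mL of 1X: {stock_mL:.0f} mL stock, {dmh_mg:.0f} mg DMH")
```

For a 100 mL batch of 1X assay buffer this corresponds to 10 mL of the 10X stock plus about 104 mg of DMH, made up to volume with water.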



IC50 Binding Assay Protocol:

Compounds (ligands) which potentially inhibit the
binding of a radioactive ligand to the specific phosphatase are
screened in a 96-well plate format as follows:

To each well are added the following solutions @ 25°C in
the following chronological order:

1. 110 µl of assay buffer.

2. 10 µl of 50 nM tritiated BzN-EJJ-CONH2 in assay
buffer (1X) @ 25°C.

3. 10 µl of testing compound in DMSO at 10 different
concentrations in serial dilution (final DMSO, about 5% v/v) in
duplicate @ 25°C.

4. 10 µl of 3.75 µg/ml purified human recombinant
GST-PTP1B in enzyme dilution buffer.

5. The plate is shaken for 2 minutes.

6. 10 µl of 0.3 µg/ml anti-glutathione S-transferase
(anti-GST) rabbit IgG (Molecular Probes) diluted in antibody dilution
buffer @ 25°C.

7. The plate is shaken for 2 minutes.

8. 50 µl of protein A-PVT SPA beads (Amersham) @
25°C.

9. The plate is shaken for 5 minutes. The binding
signal is quantified on a Microbeta 96-well plate counter.

10. The non-specific signal is defined as the enzyme-
ligand binding in the absence of anti-GST antibody.






11. 100% binding activity is defined as the enzyme-
ligand binding in the presence of anti-GST antibody, but in the
absence of the testing ligands, with the non-specific binding
subtracted.

12. Percentage of inhibition is calculated accordingly.

13. The IC50 value is approximated from the non-linear
regression fit with the 4-parameter/multiple sites equation (described
in "Robust Statistics", New York, Wiley, by P.J. Huber (1981)) and is
reported in nM units.

14. Test ligands (compounds) with larger than 90%
inhibition at 10 µM are defined as actives.
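
Steps 10 to 14 above can be sketched in code as follows. This is an illustration under stated assumptions, not the fitting procedure used in the specification: the 4-parameter logistic form is the commonly used one, the fit is a plain least-squares grid search with the bottom and top plateaus fixed at 0% and 100% (rather than the robust regression cited in step 13), and all function names and example counts are hypothetical.

```python
import math


def percent_inhibition(signal, total, nonspecific):
    """Steps 10-12: subtract non-specific counts and express lost binding in %."""
    window = total - nonspecific
    return 100.0 * (1.0 - (signal - nonspecific) / window)


def four_param_logistic(conc, bottom, top, ic50, hill):
    """Inhibition (%) predicted at a given concentration (conc > 0)."""
    return bottom + (top - bottom) / (1.0 + (ic50 / conc) ** hill)


def fit_ic50(concs, inhibitions, hills=(0.5, 0.75, 1.0, 1.5, 2.0), steps_per_decade=50):
    """Simplified grid search for IC50 and Hill slope (plateaus fixed at 0 and 100)."""
    lo = math.log10(min(concs)) - 1.0
    hi = math.log10(max(concs)) + 1.0
    n_steps = int(round((hi - lo) * steps_per_decade))
    best = None
    for i in range(n_steps + 1):
        ic50 = 10.0 ** (lo + i / steps_per_decade)
        for hill in hills:
            sse = sum(
                (four_param_logistic(c, 0.0, 100.0, ic50, hill) - y) ** 2
                for c, y in zip(concs, inhibitions)
            )
            if best is None or sse < best[0]:
                best = (sse, ic50, hill)
    return best[1], best[2]


def is_active(inhibition_at_10_uM):
    """Step 14: more than 90% inhibition at 10 uM defines an active."""
    return inhibition_at_10_uM > 90.0


# Synthetic 10-point, 3-fold dilution series (nM) with a true IC50 of 100 nM.
concs_nM = [10000.0 / 3 ** k for k in range(10)]
data = [four_param_logistic(c, 0.0, 100.0, 100.0, 1.0) for c in concs_nM]
ic50_nM, hill_slope = fit_ic50(concs_nM, data)
print(f"IC50 about {ic50_nM:.0f} nM, Hill slope {hill_slope}")
```

A production analysis would normally fit all four parameters with a proper non-linear regression routine; the grid search here is only meant to make the calculation steps concrete.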

The following Table I illustrates typical assay results of
examples of known compounds which competitively inhibit the
binding of the binding agent, BzN-EJJ-CONH2.
Preparation of Cathepsin K (O2) Mutant (CAT-K Mutant)

Cathepsin K is a prominent cysteine protease in human
osteoclasts and is believed to play a key role in osteoclast-mediated
bone resorption. Inhibitors of cathepsin K will be useful for the
treatment of bone disorders (such as osteoporosis) where excessive
bone resorption occurs. Cathepsin K is synthesized as a dormant
preproenzyme (Seq. ID No. 4). Both the pre-domain (Met1-Ala15) and
the prodomain (Leu16-Arg114) must be removed for full catalytic
activity. The mature form of the protease (Ala115-Met329) contains
the active site Cys residue (Cys139).

The mature form of cathepsin K is engineered for
expression in bacteria and other recombinant systems as a
MetAla115-Met329 construct by PCR-directed template modification of a
clone that is identified. Epitope-tagged variants are also generated
(Met[FLAG]Ala115-Met329 and MetAla115-Met329[FLAG], where
FLAG is the octapeptide AspTyrLysAspAspAspAspLys). For the
purpose of establishing a binding assay, several other constructs are
generated, including Met[FLAG]Ala115-[Cys139 to Ser139]-Met329 and
MetAla115-[Cys139 to Ser139]-Met329[FLAG] (where the active site
Cys is mutated to a Ser residue), and Met[FLAG]Ala115-[Cys139 to
Ala139]-Met329 and MetAla115-[Cys139 to Ala139]-Met329[FLAG]
(where the active site Cys is mutated to an Ala residue). In all cases,
the resulting re-engineered polypeptides can be used in a binding
assay by tethering the mutated enzymes to SPA beads via specific
anti-FLAG antibodies that are commercially available (IDI-KODAK).
Other epitope tags, GST and other fusions can also be used for this
purpose, and binding assay formats other than SPA can also be used.
Ligands based on the preferred substrate for cathepsin K (e.g. Ac-P2-
P1, Ac-P2-P1-aldehydes, Ac-P2-P1-ketones; where P1 is an amino
acid with a hydrophilic side chain, preferably Arg or Lys, and P2 is
an amino acid with a small hydrophobic side chain, preferably Leu,
Val or Phe) are suitable in their radiolabeled (tritiated) forms for
SPA-based binding assays. Similar binding assays can also be
established for other cathepsin family members.
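
The construct naming used above (an initiator Met, an optional FLAG tag at either end, and a point mutation at residue 139) can be illustrated on a short piece of sequence. The sketch below is hypothetical bookkeeping, not a cloning protocol: it uses one-letter codes, works on residues 129-144 of the preproenzyme as given in SEQ ID NO. 4 rather than the full mature sequence, and all function names are invented for illustration.

```python
FLAG = "DYKDDDDK"  # AspTyrLysAspAspAspAspLys in one-letter code


def mutate(fragment, first_residue, position, expected, replacement):
    """Replace one residue, addressed by its number in the full-length protein."""
    index = position - first_residue
    if not 0 <= index < len(fragment):
        raise ValueError("position lies outside the fragment")
    if fragment[index] != expected:
        raise ValueError(f"expected {expected} at {position}, found {fragment[index]}")
    return fragment[:index] + replacement + fragment[index + 1:]


def with_n_terminal_flag(mature):
    """Met[FLAG]-mature layout."""
    return "M" + FLAG + mature


def with_c_terminal_flag(mature):
    """Met-mature-[FLAG] layout."""
    return "M" + mature + FLAG


# Residues 129-144 of SEQ ID NO. 4 (ProValLysAsnGlnGlyGlnCysGlySerCysTrpAlaPheSerSer).
FRAGMENT_129_144 = "PVKNQGQCGSCWAFSS"

c139s = mutate(FRAGMENT_129_144, 129, 139, "C", "S")
c139a = mutate(FRAGMENT_129_144, 129, 139, "C", "A")
print(c139s, c139a)
```

Checking the expected residue before substituting guards against off-by-one numbering errors, which are easy to make when the mature enzyme is numbered from the preproenzyme.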

Preparation of Apopain (caspase-3) Mutant

Apopain is the active form of a cysteine protease
belonging to the caspase superfamily of ICE/CED-3 like enzymes. It
is derived from a catalytically dormant proenzyme that contains both
the 17 kDa large subunit (p17) and the 12 kDa small subunit (p12) of the
catalytically active enzyme within a 32 kDa proenzyme polypeptide
(p32). Apopain is a key mediator in the effector mechanism of
apoptotic cell death, and modulators of the activity of this enzyme, or
structurally-related isoforms, will be useful for the therapeutic
treatment of diseases where inappropriate apoptosis is prominent,
e.g., Alzheimer's disease.

The method used for production of apopain involves
folding of active enzyme from its constituent p17 and p12 subunits,
which are expressed separately in E. coli. The apopain p17 subunit
(Ser29-Asp175) and p12 subunit (Ser176-His277) are engineered for
expression as MetSer29-Asp175 and MetSer176-His277 constructs,
respectively, by PCR-directed template modification. For the purpose
of establishing a binding assay, several other constructs are
generated, including a MetSer29-[Cys163 to Ser163]-Asp175 large
subunit and a Met1-[Cys163 to Ser163]-His277 proenzyme. In the
former case, the active site Cys residue in the large subunit (p17) is
replaced with a Ser residue by site-directed mutagenesis. This large
subunit is then re-folded with the recombinant p12 subunit to
generate the mature form of the enzyme except with the active site
Cys mutated to a Ser. In the latter case, the same Cys163 to Ser163
mutation is made, except that the entire proenzyme is expressed. In
both cases, the resulting re-engineered polypeptides can be used in a
binding assay by tethering the mutated enzymes to SPA beads via
specific antibodies that are generated to recognize apopain (antibodies
against the prodomain, the large p17 subunit, the small p12 subunit
and the entire p17:p12 active enzyme have been generated). Epitope
tags or GST and other fusions could also be used for this purpose, and
binding assay formats other than SPA can also be used.
Ligands based on the preferred substrate for apopain (variants of
AspGluValAsp), such as Ac-AspGluValAsp, Ac-AspGluValAsp-
aldehydes and Ac-AspGluValAsp-ketones, are suitable in their
radiolabeled forms for SPA-based binding assays. Similar binding
assays can also be established for other caspase family members.
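
The subunit boundaries quoted above can be checked against the nominal subunit sizes with a quick estimate. This is only a sanity check under a stated assumption: the average residue mass of 110 Da is a general rule of thumb and is not taken from this specification, and the function names are hypothetical.

```python
AVG_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average, an assumption


def segment_length(first, last):
    """Number of residues in an inclusive first-to-last segment."""
    return last - first + 1


def approx_mass_kda(n_residues, avg_residue_mass=AVG_RESIDUE_MASS_DA):
    """Rough polypeptide mass in kDa from residue count."""
    return n_residues * avg_residue_mass / 1000.0


p17_len = segment_length(29, 175)    # Ser29-Asp175
p12_len = segment_length(176, 277)   # Ser176-His277

print(p17_len, round(approx_mass_kda(p17_len), 1))
print(p12_len, round(approx_mass_kda(p12_len), 1))
```

The estimates of roughly 16 kDa and 11 kDa are in line with the p17 and p12 designations.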



DESCRIPTION OF THE SEQUENCE LISTINGS 



SEQ ID NO. 1 is the top sense DNA strand of Figures 2A and 2B
for the PTP1B tyrosine phosphatase enzyme.

SEQ ID NO. 2 is the amino acid sequence of Figures 2A and 2B for
the PTP1B tyrosine phosphatase enzyme.

SEQ ID NO. 3 is the top sense cDNA strand of Figures 3A, 3B and
3C for the Cathepsin K preproenzyme.

SEQ ID NO. 4 is the amino acid sequence of Figures 3A, 3B and 3C
for the Cathepsin K preproenzyme.

SEQ ID NO. 5 is the top sense cDNA strand of Figures 4A and 4B
for the CPP32 apopain proenzyme.

SEQ ID NO. 6 is the amino acid sequence of Figures 4A and 4B
for the CPP32 apopain proenzyme.

SEQ ID NO. 7 is the cDNA sequence of the human PTP-1B 1-320
Ser mutant.

SEQ ID NO. 8 is the amino acid sequence of the human
PTP-1B 1-320 Ser mutant.

SEQ ID NO. 9 is the cDNA sequence for the apopain C163S mutant.

SEQ ID NO. 10 is the amino acid sequence for the apopain C163S
mutant.

SEQ ID NO. 11 is the large subunit of the heterodimeric amino acid
sequence for the apopain C163S mutant.

SEQ ID NO. 12 is the cDNA sequence for the Cathepsin K C139S
mutant.

SEQ ID NO. 13 is the cDNA sequence for the Cathepsin K C139A
mutant.

SEQ ID NO. 14 is the amino acid sequence for the Cathepsin K
C139S mutant.

SEQ ID NO. 15 is the amino acid sequence for the Cathepsin K
C139A mutant.
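
The relationship between a wild-type cDNA and its mutant counterpart in the listings can be made concrete by locating the codon for the active site residue. The sketch below compares nucleotides 601-660 of SEQ ID NO. 1 and SEQ ID NO. 7 as they appear in the sequence listing; the helper names are hypothetical, and the two-entry codon lookup covers only the codons needed here.

```python
# Nucleotides 601-660 as printed in the sequence listing (spaces removed).
WT_601_660 = "TCAGGGTCACTCAGCCCGGAGCACGGGCCCGTTGTGGTGCACTGCAGTGCAGGCATCGGC"   # SEQ ID NO. 1
MUT_601_660 = "TCAGGGTCACTCAGCCCGGAGCACGGGCCCGTTGTGGTGCACAGCAGTGCAGGCATCGGC"  # SEQ ID NO. 7

CODON_TO_RESIDUE = {"TGC": "Cys", "AGC": "Ser"}  # only the codons used below


def codon_for_residue(segment, segment_start_nt, residue_number):
    """Return the codon for a residue, given a segment and its first nucleotide number."""
    codon_start_nt = 3 * (residue_number - 1) + 1
    offset = codon_start_nt - segment_start_nt
    if offset < 0 or offset + 3 > len(segment):
        raise ValueError("codon lies outside the supplied segment")
    return segment[offset:offset + 3]


def differing_positions(seq_a, seq_b, start_nt):
    """Nucleotide numbers at which two equal-length segments differ."""
    return [start_nt + i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


wt_codon = codon_for_residue(WT_601_660, 601, 215)
mut_codon = codon_for_residue(MUT_601_660, 601, 215)
changes = differing_positions(WT_601_660, MUT_601_660, 601)

print(wt_codon, CODON_TO_RESIDUE[wt_codon], "->", mut_codon, CODON_TO_RESIDUE[mut_codon], changes)
```

Within this segment the two listings differ at a single nucleotide, which converts the codon for residue 215 from Cys to Ser.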

SEQUENCE LISTING

(i) APPLICANT: Desmarais, Sylvie
Friesen, Richard
Zamboni, Richard

(ii) TITLE OF INVENTION: NEW LIGANDS FOR PHOSPHATASE BINDING ASSAY

(iii) NUMBER OF SEQUENCES: 15

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: ROBERT J. NORTH - MERCK & CO., INC.
(B) STREET: 126 EAST LINCOLN AVENUE - P.O. BOX 2000
(C) CITY: RAHWAY
(D) STATE: NEW JERSEY
(E) COUNTRY: USA
(F) ZIP: 07065

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: FastSEQ for Windows Version 2.0

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: US unknown
(B) FILING DATE: 04-NOV-1996
(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: North, Robert J.
(B) REGISTRATION NUMBER: 27,366
(C) REFERENCE/DOCKET NUMBER: 19840 PCT

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 732-594-7262
(B) TELEFAX: 732-594-4720

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 963 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ATGGAGATGG AAAAGGAGTT CGAGCAGATC GACAAGTCCG GGAGCTGGGC GGCCATTTAC 60
CAGGATATCC GACATGAAGC CAGTGACTTC CCATGTAGAG TGGCCAAGCT TCCTAAGAAC 120
AAAAACCGAA ATAGGTACAG AGACGTCAGT CCCTTTGACC ATAGTCGGAT TAAACTACAT 180
CAAGAAGATA ATGACTATAT CAACGCTAGT TTGATAAAAA TGGAAGAAGC CCAAAGGAGT 240
TACATTCTTA CCCAGGGCCC TTTGCCTAAC ACATGCGGTC ACTTTTGGGA GATGGTGTGG 300
GAGCAGAAAA GCAGGGGTGT CGTCATGCTC AACAGAGTGA TGGAGAAAGG TTCGTTAAAA 360
TGCGCACAAT ACTGGCCACA AAAAGAAGAA AAAGAGATGA TCTTTGAAGA CACAAATTTG 420
AAATTAACAT TGATCTCTGA AGATATCAAG TCATATTATA CAGTGCGACA GCTAGAATTG 480
GAAAACCTTA CAACCCAAGA AACTCGAGAG ATCTTACATT TCCACTATAC CACATGGCCT 540
GACTTTGGAG TCCCTGAATC ACCAGCCTCA TTCTTGAACT TTCTTTTCAA AGTCCGAGAG 600
TCAGGGTCAC TCAGCCCGGA GCACGGGCCC GTTGTGGTGC ACTGCAGTGC AGGCATCGGC 660
AGGTCTGGAA CCTTCTGTCT GGCTGATACC TGCCTCCTGC TGATGGACAA GAGGAAAGAC 720
CCTTCTTCCG TTGATATCAA GAAAGTGCTG TTAGAAATGA GGAAGTTTCG GATGGGGTTG 780
ATCCAGACAG CCGACCAGCT GCGCTTCTCC TACCTGGCTG TGATCGAAGG TGCCAAATTC 840
ATCATGGGGG ACTCTTCCGT GCAGGATCAG TGGAAGGAGC TTTCCCACGA GGACCTGGAG 900
CCCCCACCCG AGCATATCCC CCCACCTCCC CGGCCACCCA AACGAATCCT GGAGCCACAC 960
TGA 963

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 320 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Glu Met Glu Lys Glu Phe Glu Gln Ile Asp Lys Ser Gly Ser Trp
1               5                   10                  15
Ala Ala Ile Tyr Gln Asp Ile Arg His Glu Ala Ser Asp Phe Pro Cys
            20                  25                  30
Arg Val Ala Lys Leu Pro Lys Asn Lys Asn Arg Asn Arg Tyr Arg Asp
        35                  40                  45
Val Ser Pro Phe Asp His Ser Arg Ile Lys Leu His Gln Glu Asp Asn
    50                  55                  60
Asp Tyr Ile Asn Ala Ser Leu Ile Lys Met Glu Glu Ala Gln Arg Ser
65                  70                  75                  80
Tyr Ile Leu Thr Gln Gly Pro Leu Pro Asn Thr Cys Gly His Phe Trp
                85                  90                  95
Glu Met Val Trp Glu Gln Lys Ser Arg Gly Val Val Met Leu Asn Arg
            100                 105                 110
Val Met Glu Lys Gly Ser Leu Lys Cys Ala Gln Tyr Trp Pro Gln Lys
        115                 120                 125
Glu Glu Lys Glu Met Ile Phe Glu Asp Thr Asn Leu Lys Leu Thr Leu
    130                 135                 140
Ile Ser Glu Asp Ile Lys Ser Tyr Tyr Thr Val Arg Gln Leu Glu Leu
145                 150                 155                 160
Glu Asn Leu Thr Thr Gln Glu Thr Arg Glu Ile Leu His Phe His Tyr
                165                 170                 175
Thr Thr Trp Pro Asp Phe Gly Val Pro Glu Ser Pro Ala Ser Phe Leu
            180                 185                 190
Asn Phe Leu Phe Lys Val Arg Glu Ser Gly Ser Leu Ser Pro Glu His
        195                 200                 205
Gly Pro Val Val Val His Cys Ser Ala Gly Ile Gly Arg Ser Gly Thr
    210                 215                 220
Phe Cys Leu Ala Asp Thr Cys Leu Leu Leu Met Asp Lys Arg Lys Asp
225                 230                 235                 240
Pro Ser Ser Val Asp Ile Lys Lys Val Leu Leu Glu Met Arg Lys Phe
                245                 250                 255
Arg Met Gly Leu Ile Gln Thr Ala Asp Gln Leu Arg Phe Ser Tyr Leu
            260                 265                 270
Ala Val Ile Glu Gly Ala Lys Phe Ile Met Gly Asp Ser Ser Val Gln
        275                 280                 285
Asp Gln Trp Lys Glu Leu Ser His Glu Asp Leu Glu Pro Pro Pro Glu
    290                 295                 300
His Ile Pro Pro Pro Pro Arg Pro Pro Lys Arg Ile Leu Glu Pro His
305                 310                 315                 320



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1669 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

GAAACAAGCA CTGGATTCCA TATCCCACTG CCAAAACCGC ATGGTTCAGA TTATCGCTAT 60
TGCAGCTTTC ATCATAATAC ACACCTTTGC TGCCGAAACG AAGCCAGACA ACAGATTTCC 120
ATCAGCAGGA TGTGGGGGCT CAAGGTTCTG CTGCTACCTG TGGTGAGCTT TGCTCTGTAC 180
CCTGAGGAGA TACTGGACAC CCACTGGGAG CTATGGAAGA AGACCCACAG GAAGCAATAT 240
AACAACAAGG TGGATGAAAT CTCTCGGCGT TTAATTTGGG AAAAAAACCT GAAGTATATT 300
TCCATCCATA ACCTTGAGGC TTCTCTTGGT GTCCATACAT ATGAACTGGC TATGAACCAC 360
CTGGGGGACA TGACCAGTGA AGAGGTGGTT CAGAAGATGA CTGGACTCAA AGTACCCCTG 420
TCTCATTCCC GCAGTAATGA CACCCTTTAT ATCCCAGAAT GGGAAGGTAG AGCCCCAGAC 480
TCTGTCGACT ATCGAAAGAA AGGATATGTT ACTCCTGTCA AAAATCAGGG TCAGTGTGGT 540
TCCTGTTGGG CTTTTAGCTC TGTGGGTGCC CTGGAGGGCC AACTCAAGAA GAAAACTGGC 600
AAACTCTTAA ATCTGAGTCC CCAGAACCTA GTGGATTGTG TGTCTGAGAA TGATGGCTGT 660
GGAGGGGGCT ACATGACCAA TGCCTTCCAA TATGTGCAGA AGAACCGGGG TATTGACTCT 720
GAAGATGCCT ACCCATATGT GGGACAGGAA GAGAGTTGTA TGTACAACCC AACAGGCAAG 780
GCAGCTAAAT GCAGAGGGTA CAGAGAGATC CCCGAGGGGA ATGAGAAAGC CCTGAAGAGG 840
GCAGTGGCCC GAGTGGGACC TGTCTCTGTG GCCATTGATG CAAGCCTGAC CTCCTTCCAG 900

TTTTACAGCA AAGGTGTGTA TTATGATGAA AGCTGCAATA GCGATAATCT GAACCATGCG 960
GTTTTGGCAG TGGGATATGG AATCCAGAAG GGAAACAAGC ACTGGATAAT TAAAAACAGC 1020
TGGGGAGAAA ACTGGGGAAA CAAAGGATAT ATCCTCATGG CTCGAAATAA GAACAACGCC 1080
TGTGGCATTG CCAACCTGGC CAGCTTCCCC AAGATGTGAC TCCAGCCAGC CAAATCCATC 1140
CTGCTCTTCC ATTTCTTCCA CGATGGTCCA GTGTAACGAT GCACTTTGGA AGGGAGTTGG 1200
TGTGCTATTT TTGAAGCAGA TGTGGTGATA CTGAGATTGT CTGTTCAGTT TCCCCATTTG 1260
TTTGTGCTTC AAATGATCCT TGCTACTTTG CTTCTCTCCA CCCATGACCT TTTTCACTGT 1320
GGCCATCAGG ACTTTCCCTG ACAGCTGTGT ACTCTTAGGC TAAGAGATGT GACTACAGCC 1380
TGCCCCTGAC TGTGTTGTCC CAGGGCTGAT GCTGTACAGG TACAGGCTGG AGATTTTCAC 1440
ATAGGTTAGA TTCTCATTCA CGGGACTAGT TAGCTTTAAG CACCCTAGAG GACTAGGGTA 1500
ATCTGACTTC TCACTTCCTA AGTTCCCTTC TATATCCTCA AGGTAGAAAT GTCTATGTTT 1560
TCTACTCCAA TTGATAAATC TATTGATAAG TCTTTGGTAC AAGTTTACAT GATAAAAAGA 1620
AATGTGATTT GTGTTCCCTT CTTTGCACTT TTGAAATAAA GTATTTATC 1669

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 329 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Trp Gly Leu Lys Val Leu Leu Leu Pro Val Val Ser Phe Ala Leu
1               5                   10                  15
Tyr Pro Glu Glu Ile Leu Asp Thr His Trp Glu Leu Trp Lys Lys Thr
            20                  25                  30
His Arg Lys Gln Tyr Asn Asn Lys Val Asp Glu Ile Ser Arg Arg Leu
        35                  40                  45
Ile Trp Glu Lys Asn Leu Lys Tyr Ile Ser Ile His Asn Leu Glu Ala
    50                  55                  60
Ser Leu Gly Val His Thr Tyr Glu Leu Ala Met Asn His Leu Gly Asp
65                  70                  75                  80
Met Thr Ser Glu Glu Val Val Gln Lys Met Thr Gly Leu Lys Val Pro
                85                  90                  95
Leu Ser His Ser Arg Ser Asn Asp Thr Leu Tyr Ile Pro Glu Trp Glu
            100                 105                 110
Gly Arg Ala Pro Asp Ser Val Asp Tyr Arg Lys Lys Gly Tyr Val Thr
        115                 120                 125
Pro Val Lys Asn Gln Gly Gln Cys Gly Ser Cys Trp Ala Phe Ser Ser
    130                 135                 140
Val Gly Ala Leu Glu Gly Gln Leu Lys Lys Lys Thr Gly Lys Leu Leu
145                 150                 155                 160
Asn Leu Ser Pro Gln Asn Leu Val Asp Cys Val Ser Glu Asn Asp Gly
                165                 170                 175
Cys Gly Gly Gly Tyr Met Thr Asn Ala Phe Gln Tyr Val Gln Lys Asn
            180                 185                 190
Arg Gly Ile Asp Ser Glu Asp Ala Tyr Pro Tyr Val Gly Gln Glu Glu
        195                 200                 205
Ser Cys Met Tyr Asn Pro Thr Gly Lys Ala Ala Lys Cys Arg Gly Tyr
    210                 215                 220
Arg Glu Ile Pro Glu Gly Asn Glu Lys Ala Leu Lys Arg Ala Val Ala
225                 230                 235                 240
Arg Val Gly Pro Val Ser Val Ala Ile Asp Ala Ser Leu Thr Ser Phe
                245                 250                 255
Gln Phe Tyr Ser Lys Gly Val Tyr Tyr Asp Glu Ser Cys Asn Ser Asp
            260                 265                 270
Asn Leu Asn His Ala Val Leu Ala Val Gly Tyr Gly Ile Gln Lys Gly
        275                 280                 285
Asn Lys His Trp Ile Ile Lys Asn Ser Trp Gly Glu Asn Trp Gly Asn
    290                 295                 300
Lys Gly Tyr Ile Leu Met Ala Arg Asn Lys Asn Asn Ala Cys Gly Ile
305                 310                 315                 320
Ala Asn Leu Ala Ser Phe Pro Lys Met

















(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1001 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

CTGCAGGAAT TCGGCACGAG GGGTGCTATT GTGAGGCGGT TGTAGAAGTT AATAAAGGTA 60
TCCATGGAGA ACACTGAAAA CTCAGTGGAT TCAAAATCCA TTAAAAATTT GGAACCAAAG 120
ATCATACATG GAAGCGAATC AATGGACTCT GGAATATCCC TGGACAACAG TTATAAAATG 180
GATTATCCTG AGATGGGTTT ATGTATAATA ATTAATAATA AGAATTTTCA TAAGAGCACT 240
GGAATGACAT CTCGGTCTGG TACAGATGTC GATGCAGCAA ACCTCAGGGA AACATTCAGA 300
AACTTGAAAT ATGAAGTCAG GAATAAAAAT GATCTTACAC GTGAAGAAAT TGTGGAATTG 360
ATGCGTGATG TTTCTAAAGA AGATCACAGC AAAAGGAGCA GTTTTGTTTG TGTGCTTCTG 420
AGCCATGGTG AAGAAGGAAT AATTTTTGGA ACAAATGGAC CTGTTGACCT GAAAAAAATA 480
ACAAACTTTT TCAGAGGGGA TCGTTGTAGA AGTCTAACTG GAAAACCCAA ACTTTTCATT 540
ATTCAGGCCT GCCGTGGTAC AGAACTGGAC TGTGGCATTG AGACAGACAG TGGTGTTGAT 600
GATGACATGG CGTGTCATAA AATACCAGTG GAGGCCGACT TCTTGTATGC ATACTCCACA 660
GCACCTGGTT ATTATTCTTG GCGAAATTCA AAGGATGGCT CCTGGTTCAT CCAGTCGCTT 720
TGTGCCATGC TGAAACAGTA TGCCGACAAG CTTGAATTTA TGCACATTCT TACCCGGGTT 780
AACCGAAAGG TGGCAACAGA ATTTGAGTCC TTTTCCTTTG ACGCTACTTT TCATGCAAAG 840
AAACAGATTC CATGTATTGT TTCCATGCTC ACAAAAGAAC TCTATTTTTA TCACTAAAGA 900
AATGGTTGGT TGGTGGTTTT TTTTAGTTTG TATGCCAAGT GAGAAGATGG TATATTTGGT 960
AGTGTATTTC CCTCTCATTT TGACCTACTC TCATGCTGCA G 1001



(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 277 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Glu Asn Thr Glu Asn Ser Val Asp Ser Lys Ser Ile Lys Asn Leu
1               5                   10                  15
Glu Pro Lys Ile Ile His Gly Ser Glu Ser Met Asp Ser Gly Ile Ser
            20                  25                  30
Leu Asp Asn Ser Tyr Lys Met Asp Tyr Pro Glu Met Gly Leu Cys Ile
        35                  40                  45
Ile Ile Asn Asn Lys Asn Phe His Lys Ser Thr Gly Met Thr Ser Arg
    50                  55                  60
Ser Gly Thr Asp Val Asp Ala Ala Asn Leu Arg Glu Thr Phe Arg Asn
65                  70                  75                  80
Leu Lys Tyr Glu Val Arg Asn Lys Asn Asp Leu Thr Arg Glu Glu Ile
                85                  90                  95
Val Glu Leu Met Arg Asp Val Ser Lys Glu Asp His Ser Lys Arg Ser
            100                 105                 110
Ser Phe Val Cys Val Leu Leu Ser His Gly Glu Glu Gly Ile Ile Phe
        115                 120                 125
Gly Thr Asn Gly Pro Val Asp Leu Lys Lys Ile Thr Asn Phe Phe Arg
    130                 135                 140
Gly Asp Arg Cys Arg Ser Leu Thr Gly Lys Pro Lys Leu Phe Ile Ile
145                 150                 155                 160
Gln Ala Cys Arg Gly Thr Glu Leu Asp Cys Gly Ile Glu Thr Asp Ser
                165                 170                 175
Gly Val Asp Asp Asp Met Ala Cys His Lys Ile Pro Val Glu Ala Asp
            180                 185                 190
Phe Leu Tyr Ala Tyr Ser Thr Ala Pro Gly Tyr Tyr Ser Trp Arg Asn
        195                 200                 205
Ser Lys Asp Gly Ser Trp Phe Ile Gln Ser Leu Cys Ala Met Leu Lys
    210                 215                 220
Gln Tyr Ala Asp Lys Leu Glu Phe Met His Ile Leu Thr Arg Val Asn
225                 230                 235                 240
Arg Lys Val Ala Thr Glu Phe Glu Ser Phe Ser Phe Asp Ala Thr Phe
                245                 250                 255
His Ala Lys Lys Gln Ile Pro Cys Ile Val Ser Met Leu Thr Lys Glu
            260                 265                 270
Leu Tyr Phe Tyr His
        275
(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 963 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

ATGGAGATGG AAAAGGAGTT CGAGCAGATC GACAAGTCCG GGAGCTGGGC GGCCATTTAC 60
CAGGATATCC GACATGAAGC CAGTGACTTC CCATGTAGAG TGGCCAAGCT TCCTAAGAAC 120
AAAAACCGAA ATAGGTACAG AGACGTCAGT CCCTTTGACC ATAGTCGGAT TAAACTACAT 180
CAAGAAGATA ATGACTATAT CAACGCTAGT TTGATAAAAA TGGAAGAAGC CCAAAGGAGT 240
TACATTCTTA CCCAGGGCCC TTTGCCTAAC ACATGCGGTC ACTTTTGGGA GATGGTGTGG 300
GAGCAGAAAA GCAGGGGTGT CGTCATGCTC AACAGAGTGA TGGAGAAAGG TTCGTTAAAA 360
TGCGCACAAT ACTGGCCACA AAAAGAAGAA AAAGAGATGA TCTTTGAAGA CACAAATTTG 420
AAATTAACAT TGATCTCTGA AGATATCAAG TCATATTATA CAGTGCGACA GCTAGAATTG 480
GAAAACCTTA CAACCCAAGA AACTCGAGAG ATCTTACATT TCCACTATAC CACATGGCCT 540
GACTTTGGAG TCCCTGAATC ACCAGCCTCA TTCTTGAACT TTCTTTTCAA AGTCCGAGAG 600
TCAGGGTCAC TCAGCCCGGA GCACGGGCCC GTTGTGGTGC ACAGCAGTGC AGGCATCGGC 660
AGGTCTGGAA CCTTCTGTCT GGCTGATACC TGCCTCCTGC TGATGGACAA GAGGAAAGAC 720
CCTTCTTCCG TTGATATCAA GAAAGTGCTG TTAGAAATGA GGAAGTTTCG GATGGGGTTG 780
ATCCAGACAG CCGACCAGCT GCGCTTCTCC TACCTGGCTG TGATCGAAGG TGCCAAATTC 840
ATCATGGGGG ACTCTTCCGT GCAGGATCAG TGGAAGGAGC TTTCCCACGA GGACCTGGAG 900
CCCCCACCCG AGCATATCCC CCCACCTCCC CGGCCACCCA AACGAATCCT GGAGCCACAC 960
TGA 963



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 322 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Met Glu Met Glu Lys Glu Phe Glu Gln Ile Asp Lys Ser Gly Ser Trp
1               5                   10                  15
Ala Ala Ile Tyr Gln Asp Ile Arg His Glu Ala Ser Asp Phe Pro Cys
            20                  25                  30
Arg Val Ala Lys Leu Pro Lys Asn Lys Asn Arg Asn Arg Tyr Arg Asp
        35                  40                  45
Val Ser Pro Phe Asp His Ser Arg Ile Lys Leu His Gln Glu Asp Asn
    50                  55                  60
Asp Tyr Ile Asn Ala Ser Leu Ile Lys Met Glu Glu Ala Gln Arg Ser
65                  70                  75                  80
Tyr Ile Leu Thr Gln Gly Pro Leu Pro Asn Thr Cys Gly His Phe Trp
                85                  90                  95
Glu Met Val Trp Glu Gln Lys Ser Arg Gly Val Val Met Leu Asn Arg
            100                 105                 110
Val Met Glu Lys Gly Ser Leu Lys Cys Ala Gln Tyr Trp Pro Gln Lys
        115                 120                 125
Glu Glu Lys Glu Met Ile Phe Glu Asp Thr Asn Leu Lys Leu Thr Leu
    130                 135                 140
Ile Ser Glu Asp Ile Lys Ser Tyr Tyr Thr Val Arg Gln Leu Glu Leu
145                 150                 155                 160
Glu Asn Leu Thr Thr Gln Glu Thr Arg Glu Ile Leu His Phe His Tyr
                165                 170                 175
Thr Thr Trp Pro Asp Phe Gly Val Pro Glu Ser Pro Ala Ser Phe Leu
            180                 185                 190
Asn Phe Leu Phe Lys Val Arg Glu Ser Gly Ser Leu Ser Pro Glu His
        195                 200                 205
Gly Pro Val Val Val His Ser Ser Ala Gly Ile Gly Thr Cys Gly Arg
    210                 215                 220
Ser Gly Thr Phe Cys Leu Ala Asp Thr Cys Leu Leu Leu Met Asp Lys
225                 230                 235                 240
Arg Lys Asp Pro Ser Ser Val Asp Ile Lys Lys Val Leu Leu Glu Met
                245                 250                 255
Arg Lys Phe Arg Met Gly Leu Ile Gln Thr Ala Asp Gln Leu Arg Phe
            260                 265                 270
Ser Tyr Leu Ala Val Ile Glu Gly Ala Lys Phe Ile Met Gly Asp Ser
        275                 280                 285
Ser Val Gln Asp Gln Trp Lys Glu Leu Ser His Glu Asp Leu Glu Pro
    290                 295                 300
Pro Pro Glu His Ile Pro Pro Pro Pro Arg Pro Pro Lys Arg Ile Leu
305                 310                 315                 320
Glu Pro



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1001 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:



CTGCAGGAAT TCGGCACGAG GGGTGCTATT GTGAGGCGGT TGTAGAAGTT AATAAAGGTA 60
TCCATGGAGA ACACTGAAAA CTCAGTGGAT TCAAAATCCA TTAAAAATTT GGAACCAAAG 120
ATCATACATG GAAGCGAATC AATGGACTCT GGAATATCCC TGGACAACAG TTATAAAATG 180
GATTATCCTG AGATGGGTTT ATGTATAATA ATTAATAATA AGAATTTTCA TAAGAGCACT 240
GGAATGACAT CTCGGTCTGG TACAGATGTC GATGCAGCAA ACCTCAGGGA AACATTCAGA 300
AACTTGAAAT ATGAAGTCAG GAATAAAAAT GATCTTACAC GTGAAGAAAT TGTGGAATTG 360
ATGCGTGATG TTTCTAAAGA AGATCACAGC AAAAGGAGCA GTTTTGTTTG TGTGCTTCTG 420
AGCCATGGTG AAGAAGGAAT AATTTTTGGA ACAAATGGAC CTGTTGACCT GAAAAAAATA 480
ACAAACTTTT TCAGAGGGGA TCGTTGTAGA AGTCTAACTG GAAAACCCAA ACTTTTCATT 540
ATTCAGGCCT CCCGTGGTAC AGAACTGGAC TGTGGCATTG AGACAGACAG TGGTGTTGAT 600
GATGACATGG CGTGTCATAA AATACCAGTG GAGGCCGACT TCTTGTATGC ATACTCCACA 660
GCACCTGGTT ATTATTCTTG GCGAAATTCA AAGGATGGCT CCTGGTTCAT CCAGTCGCTT 720
TGTGCCATGC TGAAACAGTA TGCCGACAAG CTTGAATTTA TGCACATTCT TACCCGGGTT 780
AACCGAAAGG TGGCAACAGA ATTTGAGTCC TTTTCCTTTG ACGCTACTTT TCATGCAAAG 840
AAACAGATTC CATGTATTGT TTCCATGCTC ACAAAAGAAC TCTATTTTTA TCACTAAAGA 900
AATGGTTGGT TGGTGGTTTT TTTTAGTTTG TATGCCAAGT GAGAAGATGG TATATTTGGT 960
ACTGTATTTC CCTCTCATTT TGACCTACTC TCATGCTGCA G 1001



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 277 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 



Met Glu Asn Thr Glu Asn Ser Val Asp Ser Lys Ser Ile Lys Asn Leu
1               5                   10                  15

Glu Pro Lys Ile Ile His Gly Ser Glu Ser Met Asp Ser Gly Ile Ser
            20                  25                  30

Leu Asp Asn Ser Tyr Lys Met Asp Tyr Pro Glu Met Gly Leu Cys Ile
        35                  40                  45

Ile Ile Asn Asn Lys Asn Phe His Lys Ser Thr Gly Met Thr Ser Arg
    50                  55                  60

Ser Gly Thr Asp Val Asp Ala Ala Asn Leu Arg Glu Thr Phe Arg Asn
65                  70                  75                  80

Leu Lys Tyr Glu Val Arg Asn Lys Asn Asp Leu Thr Arg Glu Glu Ile
                85                  90                  95

Val Glu Leu Met Arg Asp Val Ser Lys Glu Asp His Ser Lys Arg Ser
            100                 105                 110

Ser Phe Val Cys Val Leu Leu Ser His Gly Glu Glu Gly Ile Ile Phe
        115                 120                 125

Gly Thr Asn Gly Pro Val Asp Leu Lys Lys Ile Thr Asn Phe Phe Arg
    130                 135                 140

Gly Asp Arg Cys Arg Ser Leu Thr Gly Lys Pro Lys Leu Phe Ile Ile
145                 150                 155                 160

Gln Ala Ser Arg Gly Thr Glu Leu Asp Cys Gly Ile Glu Thr Asp Ser
                165                 170                 175

Gly Val Asp Asp Asp Met Ala Cys His Lys Ile Pro Val Glu Ala Asp
            180                 185                 190

Phe Leu Tyr Ala Tyr Ser Thr Ala Pro Gly Tyr Tyr Ser Trp Arg Asn
        195                 200                 205

Ser Lys Asp Gly Ser Trp Phe Ile Gln Ser Leu Cys Ala Met Leu Lys
    210                 215                 220

Gln Tyr Ala Asp Lys Leu Glu Phe Met His Ile Leu Thr Arg Val Asn
225                 230                 235                 240

Arg Lys Val Ala Thr Glu Phe Glu Ser Phe Ser Phe Asp Ala Thr Phe
                245                 250                 255

His Ala Lys Lys Gln Ile Pro Cys Ile Val Ser Met Leu Thr Lys Glu
            260                 265                 270

Leu Tyr Phe Tyr His
        275



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 277 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:



Met Glu Asn Thr Glu Asn Ser Val Asp Ser Lys Ser Ile Lys Asn Leu
1               5                   10                  15

Glu Pro Lys Ile Ile His Gly Ser Glu Ser Met Asp Ser Gly Ile Ser
            20                  25                  30

Leu Asp Asn Ser Tyr Lys Met Asp Tyr Pro Glu Met Gly Leu Cys Ile
        35                  40                  45

Ile Ile Asn Asn Lys Asn Phe His Lys Ser Thr Gly Met Thr Ser Arg
    50                  55                  60

Ser Gly Thr Asp Val Asp Ala Ala Asn Leu Arg Glu Thr Phe Arg Asn
65                  70                  75                  80

Leu Lys Tyr Glu Val Arg Asn Lys Asn Asp Leu Thr Arg Glu Glu Ile
                85                  90                  95

Val Glu Leu Met Arg Asp Val Ser Lys Glu Asp His Ser Lys Arg Ser
            100                 105                 110

Ser Phe Val Cys Val Leu Leu Ser His Gly Glu Glu Gly Ile Ile Phe
        115                 120                 125

Gly Thr Asn Gly Pro Val Asp Leu Lys Lys Ile Thr Asn Phe Phe Arg
    130                 135                 140

Gly Asp Arg Cys Arg Ser Leu Thr Gly Lys Pro Lys Leu Phe Ile Ile
145                 150                 155                 160

Gln Ala Ser Arg Gly Thr Glu Leu Asp Cys Gly Ile Glu Thr Asp Ser
                165                 170                 175

Gly Val Asp Asp Asp Met Ala Cys His Lys Ile Pro Val Glu Ala Asp
            180                 185                 190

Phe Leu Tyr Ala Tyr Ser Thr Ala Pro Gly Tyr Tyr Ser Trp Arg Asn
        195                 200                 205

Ser Lys Asp Gly Ser Trp Phe Ile Gln Ser Leu Cys Ala Met Leu Lys
    210                 215                 220

Gln Tyr Ala Asp Lys Leu Glu Phe Met His Ile Leu Thr Arg Val Asn
225                 230                 235                 240

Arg Lys Val Ala Thr Glu Phe Glu Ser Phe Ser Phe Asp Ala Thr Phe
                245                 250                 255

His Ala Lys Lys Gln Ile Pro Cys Ile Val Ser Met Leu Thr Lys Glu
            260                 265                 270

Leu Tyr Phe Tyr His
        275

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 990 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 



ATGTGGGGGC TCAAGGTTCT GCTGCTACCT GTGGTGAGCT TTGCTCTGTA CCCTGAGGAG 60
ATACTGGACA CCCACTGGGA GCTATGGAAG AAGACCCACA GGAAGCAATA TAACAACAAG 120
GTGGATGAAA TCTCTCGGCG TTTAATTTGG GAAAAAAACC TGAAGTATAT TTCCATCCAT 180
AACCTTGAGG CTTCTCTTGG TGTCCATACA TATGAACTGG CTATGAACCA CCTGGGGGAC 240
ATGACCAGTG AAGAGGTGGT TCAGAAGATG ACTGGACTCA AAGTACCCCT GTCTCATTCC 300
CGCAGTAATG ACACCCTTTA TATCCCAGAA TGGGAAGGTA GAGCCCCAGA CTCTGTCGAC 360
TATCGAAAGA AAGGATATGT TACTCCTGTC AAAAATCAGG GTCAGTGTGG TTCCTCTTGG 420
GCTTTTAGCT CTGTGGGTGC CCTGGAGGGC CAACTCAAGA AGAAAACTGG CAAACTCTTA 480
AATCTGAGTC CCCAGAACCT AGTGGATTGT GTGTCTGAGA ATGATGGCTG TGGAGGGGGC 540
TACATGACCA ATGCCTTCCA ATATGTGCAG AAGAACCGGG GTATTGACTC TGAAGATGCC 600
TACCCATATG TGGGACAGGA AGAGAGTTGT ATGTACAACC CAACAGGCAA GGCAGCTAAA 660
TGCAGAGGGT ACAGAGAGAT CCCCGAGGGG AATGAGAAAG CCCTGAAGAG GGCAGTGGCC 720
CGAGTGGGAC CTGTCTCTGT GGCCATTGAT GCAAGCCTGA CCTCCTTCCA GTTTTACAGC 780
AAAGGTGTGT ATTATGATGA AAGCTGCAAT AGCGATAATC TGAACCATGC GGTTTTGGCA 840
GTGGGATATG GAATCCAGAA GGGAAACAAG CACTGGATAA TTAAAAACAG CTGGGGAGAA 900
AACTGGGGAA ACAAAGGATA TATCCTCATG GCTCGAAATA AGAACAACGC CTGTGGCATT 960
GCCAACCTGG CCAGCTTCCC CAAGATGTGA 990



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 990 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

ATGTGGGGGC TCAAGGTTCT GCTGCTACCT GTGGTGAGCT TTGCTCTGTA CCCTGAGGAG 60
ATACTGGACA CCCACTGGGA GCTATGGAAG AAGACCCACA GGAAGCAATA TAACAACAAG 120
GTGGATGAAA TCTCTCGGCG TTTAATTTGG GAAAAAAACC TGAAGTATAT TTCCATCCAT 180
AACCTTGAGG CTTCTCTTGG TGTCCATACA TATGAACTGG CTATGAACCA CCTGGGGGAC 240
ATGACCAGTG AAGAGGTGGT TCAGAAGATG ACTGGACTCA AAGTACCCCT GTCTCATTCC 300
CGCAGTAATG ACACCCTTTA TATCCCAGAA TGGGAAGGTA GAGCCCCAGA CTCTGTCGAC 360
TATCGAAAGA AAGGATATGT TACTCCTGTC AAAAATCAGG GTCAGTGTGG TTCCGCTTGG 420
GCTTTTAGCT CTGTGGGTGC CCTGGAGGGC CAACTCAAGA AGAAAACTGG CAAACTCTTA 480
AATCTGAGTC CCCAGAACCT AGTGGATTGT GTGTCTGAGA ATGATGGCTG TGGAGGGGGC 540
TACATGACCA ATGCCTTCCA ATATGTGCAG AAGAACCGGG GTATTGACTC TGAAGATGCC 600
TACCCATATG TGGGACAGGA AGAGAGTTGT ATGTACAACC CAACAGGCAA GGCAGCTAAA 660
TGCAGAGGGT ACAGAGAGAT CCCCGAGGGG AATGAGAAAG CCCTGAAGAG GGCAGTGGCC 720
CGAGTGGGAC CTGTCTCTGT GGCCATTGAT GCAAGCCTGA CCTCCTTCCA GTTTTACAGC 780
AAAGGTGTGT ATTATGATGA AAGCTGCAAT AGCGATAATC TGAACCATGC GGTTTTGGCA 840
GTGGGATATG GAATCCAGAA GGGAAACAAG CACTGGATAA TTAAAAACAG CTGGGGAGAA 900
AACTGGGGAA ACAAAGGATA TATCCTCATG GCTCGAAATA AGAACAACGC CTGTGGCATT 960
GCCAACCTGG CCAGCTTCCC CAAGATGTGA 990



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 329 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:



Met Trp Gly Leu Lys Val Leu Leu Leu Pro Val Val Ser Phe Ala Leu
1               5                   10                  15

Tyr Pro Glu Glu Ile Leu Asp Thr His Trp Glu Leu Trp Lys Lys Thr
            20                  25                  30

His Arg Lys Gln Tyr Asn Asn Lys Val Asp Glu Ile Ser Arg Arg Leu
        35                  40                  45

Ile Trp Glu Lys Asn Leu Lys Tyr Ile Ser Ile His Asn Leu Glu Ala
    50                  55                  60

Ser Leu Gly Val His Thr Tyr Glu Leu Ala Met Asn His Leu Gly Asp
65                  70                  75                  80

Met Thr Ser Glu Glu Val Val Gln Lys Met Thr Gly Leu Lys Val Pro
                85                  90                  95

Leu Ser His Ser Arg Ser Asn Asp Thr Leu Tyr Ile Pro Glu Trp Glu
            100                 105                 110

Gly Arg Ala Pro Asp Ser Val Asp Tyr Arg Lys Lys Gly Tyr Val Thr
        115                 120                 125

Pro Val Lys Asn Gln Gly Gln Cys Gly Ser Ser Trp Ala Phe Ser Ser
    130                 135                 140

Val Gly Ala Leu Glu Gly Gln Leu Lys Lys Lys Thr Gly Lys Leu Leu
145                 150                 155                 160

Asn Leu Ser Pro Gln Asn Leu Val Asp Cys Val Ser Glu Asn Asp Gly
                165                 170                 175

Cys Gly Gly Gly Tyr Met Thr Asn Ala Phe Gln Tyr Val Gln Lys Asn
            180                 185                 190

Arg Gly Ile Asp Ser Glu Asp Ala Tyr Pro Tyr Val Gly Gln Glu Glu
        195                 200                 205

Ser Cys Met Tyr Asn Pro Thr Gly Lys Ala Ala Lys Cys Arg Gly Tyr
    210                 215                 220

Arg Glu Ile Pro Glu Gly Asn Glu Lys Ala Leu Lys Arg Ala Val Ala
225                 230                 235                 240

Arg Val Gly Pro Val Ser Val Ala Ile Asp Ala Ser Leu Thr Ser Phe
                245                 250                 255

Gln Phe Tyr Ser Lys Gly Val Tyr Tyr Asp Glu Ser Cys Asn Ser Asp
            260                 265                 270

Asn Leu Asn His Ala Val Leu Ala Val Gly Tyr Gly Ile Gln Lys Gly
        275                 280                 285

Asn Lys His Trp Ile Ile Lys Asn Ser Trp Gly Glu Asn Trp Gly Asn
    290                 295                 300

Lys Gly Tyr Ile Leu Met Ala Arg Asn Lys Asn Asn Ala Cys Gly Ile
305                 310                 315                 320

Ala Asn Leu Ala Ser Phe Pro Lys Met
                325



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 329 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 



Met Trp Gly Leu Lys Val Leu Leu Leu Pro Val Val Ser Phe Ala Leu
1               5                   10                  15

Tyr Pro Glu Glu Ile Leu Asp Thr His Trp Glu Leu Trp Lys Lys Thr
            20                  25                  30

His Arg Lys Gln Tyr Asn Asn Lys Val Asp Glu Ile Ser Arg Arg Leu
        35                  40                  45

Ile Trp Glu Lys Asn Leu Lys Tyr Ile Ser Ile His Asn Leu Glu Ala
    50                  55                  60

Ser Leu Gly Val His Thr Tyr Glu Leu Ala Met Asn His Leu Gly Asp
65                  70                  75                  80

Met Thr Ser Glu Glu Val Val Gln Lys Met Thr Gly Leu Lys Val Pro
                85                  90                  95

Leu Ser His Ser Arg Ser Asn Asp Thr Leu Tyr Ile Pro Glu Trp Glu
            100                 105                 110

Gly Arg Ala Pro Asp Ser Val Asp Tyr Arg Lys Lys Gly Tyr Val Thr
        115                 120                 125

Pro Val Lys Asn Gln Gly Gln Cys Gly Ser Ala Trp Ala Phe Ser Ser
    130                 135                 140

Val Gly Ala Leu Glu Gly Gln Leu Lys Lys Lys Thr Gly Lys Leu Leu
145                 150                 155                 160

Asn Leu Ser Pro Gln Asn Leu Val Asp Cys Val Ser Glu Asn

WHAT IS CLAIMED:

1. A peptide comprising a ligand having binding affinity for
a tyrosine phosphatase or cysteine protease, wherein said ligand contains two
or more 4-phosphono(difluoromethyl) phenylalanine groups.

2. The peptide of Claim 1 wherein said ligand has a greater
binding affinity than the corresponding ligand only containing one of said
4-phosphono(difluoromethyl) phenylalanine groups.

3. A peptide selected from the group consisting of:

N-Benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanineamide (BzN-EJJ-CONH2), where E is glutamic acid and J is [4-phosphono(difluoromethyl)]-L-phenylalanyl;

N-Benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide;

N-Acetyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide;

L-Glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide;

L-Lysinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide;

L-Serinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide;

L-Prolinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide; and

L-Isoleucinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide.

4. The peptide of Claim 3 in tritiated or I-125 iodinated form.

5. A tritiated peptide, N-(3,5-Ditritio)benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanineamide.

6. A process for increasing the binding affinity of a ligand
for a tyrosine phosphatase or cysteine protease comprising introducing into
the ligand two or more 4-phosphono(difluoromethyl) phenylalanine groups.

ATGGAGATGGAAAAGGAGTTCGAGCAGATCGACAAGTCCGGGAGCTGGGCGGCCATTTAC
1 ---------+---------+---------+---------+---------+---------+ 60
TACCTCTACCTTTTCCTCAAGCTCGTCTAGCTGTTCAGGCCCTCGACCCGCCGGTAAATG
1 MetGluMetGluLysGluPheGluGlnIleAspLysSerGlySerTrpAlaAlaIleTyr 20

CAGGATATCCGACATGAAGCCAGTGACTTCCCATGTAGAGTGGCCAAGCTTCCTAAGAAC
61 ---------+---------+---------+---------+---------+---------+ 120
GTCCTATAGGCTGTACTTCGGTCACTGAAGGGTACATCTCACCGGTTCGAAGGATTCTTG
21 GlnAspIleArgHisGluAlaSerAspPheProCysArgValAlaLysLeuProLysAsn 40

AAAAACCGAAATAGGTACAGAGACGTCAGTCCCTTTGACCATAGTCGGATTAAACTACAT
121 ---------+---------+---------+---------+---------+---------+ 180
TTTTTGGCTTTATCCATGTCTCTGCAGTCAGGGAAACTGGTATCAGCCTAATTTGATGTA
41 LysAsnArgAsnArgTyrArgAspValSerProPheAspHisSerArgIleLysLeuHis 60

CAAGAAGATAATGACTATATCAACGCTAGTTTGATAAAAATGGAAGAAGCCCAAAGGAGT
181 ---------+---------+---------+---------+---------+---------+ 240
GTTCTTCTATTACTGATATAGTTGCGATCAAACTATTTTTACCTTCTTCGGGTTTCCTCA
61 GlnGluAspAsnAspTyrIleAsnAlaSerLeuIleLysMetGluGluAlaGlnArgSer 80

TACATTCTTACCCAGGGCCCTTTGCCTAACACATGCGGTCACTTTTGGGAGATGGTGTGG
241 ---------+---------+---------+---------+---------+---------+ 300
ATGTAAGAATGGGTCCCGGGAAACGGATTGTGTACGCCAGTGAAAACCCTCTACCACACC
81 TyrIleLeuThrGlnGlyProLeuProAsnThrCysGlyHisPheTrpGluMetValTrp 100

GAGCAGAAAAGCAGGGGTGTCGTCATGCTCAACAGAGTGATGGAGAAAGGTTCGTTAAAA
301 ---------+---------+---------+---------+---------+---------+ 360
CTCGTCTTTTCGTCCCCACAGCAGTACGAGTTGTCTCACTACCTCTTTCCAAGCAATTTT
101 GluGlnLysSerArgGlyValValMetLeuAsnArgValMetGluLysGlySerLeuLys 120

TGCGCACAATACTGGCCACAAAAAGAAGAAAAAGAGATGATCTTTGAAGACACAAATTTG
361 ---------+---------+---------+---------+---------+---------+ 420
ACGCGTGTTATGACCGGTGTTTTTCTTCTTTTTCTCTACTAGAAACTTCTGTGTTTAAAC
121 CysAlaGlnTyrTrpProGlnLysGluGluLysGluMetIlePheGluAspThrAsnLeu 140

AAATTAACATTGATCTCTGAAGATATCAAGTCATATTATACAGTGCGACAGCTAGAATTG
421 ---------+---------+---------+---------+---------+---------+ 480
TTTAATTGTAACTAGAGACTTCTATAGTTCAGTATAATATGTCACGCTGTCGATCTTAAC
141 LysLeuThrLeuIleSerGluAspIleLysSerTyrTyrThrValArgGlnLeuGluLeu 160

GAAAACCTTACAACCCAAGAAACTCGAGAGATCTTACATTTCCACTATACCACATGGCCT
481 ---------+---------+---------+---------+---------+---------+ 540
CTTTTGGAATGTTGGGTTCTTTGAGCTCTCTAGAATGTAAAGGTGATATGGTGTACCGGA
161 GluAsnLeuThrThrGlnGluThrArgGluIleLeuHisPheHisTyrThrThrTrpPro 180

GACTTTGGAGTCCCTGAATCACCAGCCTCATTCTTGAACTTTCTTTTCAAAGTCCGAGAG
541 ---------+---------+---------+---------+---------+---------+ 600
CTGAAACCTCAGGGACTTAGTGGTCGGAGTAAGAACTTGAAAGAAAAGTTTCAGGCTCTC
181 AspPheGlyValProGluSerProAlaSerPheLeuAsnPheLeuPheLysValArgGlu 200

TCAGGGTCACTCAGCCCGGAGCACGGGCCCGTTGTGGTGCACTGCAGTGCAGGCATCGGC
601 ---------+---------+---------+---------+---------+---------+ 660
AGTCCCAGTGAGTCGGGCCTCGTGCCCGGGCAACACCACGTGACGTCACGTCCGTAGCCG
201 SerGlySerLeuSerProGluHisGlyProValValValHisCysSerAlaGlyIleGly 220

AGGTCTGGAACCTTCTGTCTGGCTGATACCTGCCTCCTGCTGATGGACAAGAGGAAAGAC
661 ---------+---------+---------+---------+---------+---------+ 720
TCCAGACCTTGGAAGACAGACCGACTATGGACGGAGGACGACTACCTGTTCTCCTTTCTG
221 ArgSerGlyThrPheCysLeuAlaAspThrCysLeuLeuLeuMetAspLysArgLysAsp 240

CCTTCTTCCGTTGATATCAAGAAAGTGCTGTTAGAAATGAGGAAGTTTCGGATGGGGTTG
721 ---------+---------+---------+---------+---------+---------+ 780
GGAAGAAGGCAACTATAGTTCTTTCACGACAATCTTTACTCCTTCAAAGCCTACCCCAAC
241 ProSerSerValAspIleLysLysValLeuLeuGluMetArgLysPheArgMetGlyLeu 260

ATCCAGACAGCCGACCAGCTGCGCTTCTCCTACCTGGCTGTGATCGAAGGTGCCAAATTC
781 ---------+---------+---------+---------+---------+---------+ 840
TAGGTCTGTCGGCTGGTCGACGCGAAGAGGATGGACCGACACTAGCTTCCACGGTTTAAG
261 IleGlnThrAlaAspGlnLeuArgPheSerTyrLeuAlaValIleGluGlyAlaLysPhe 280

ATCATGGGGGACTCTTCCGTGCAGGATCAGTGGAAGGAGCTTTCCCACGAGGACCTGGAG
841 ---------+---------+---------+---------+---------+---------+ 900
TAGTACCCCCTGAGAAGGCACGTCCTAGTCACCTTCCTCGAAAGGGTGCTCCTGGACCTC
281 IleMetGlyAspSerSerValGlnAspGlnTrpLysGluLeuSerHisGluAspLeuGlu 300

CCCCCACCCGAGCATATCCCCCCACCTCCCCGGCCACCCAAACGAATCCTGGAGCCACACTGA
901 ---------+---------+---------+---------+---------+---------+ 960
GGGGGTGGGCTCGTATAGGGGGGTGGAGGGGCCGGTGGGTTTGCTTAGGACCTCGGTGTGACT
301 ProProProGluHisIleProProProProArgProProLysArgIleLeuGluProHisEnd 320

GAAACAAGCACTGGATTCCATATCCCACTGCCAAAACCGCATGGTTCAGATTATCGCTAT
1 ---------+---------+---------+---------+---------+---------+ 60
CTTTGTTCGTGACCTAAGGTATAGGGTGACGGTTTTGGCGTACCAAGTCTAATAGCGATA

TGCAGCTTTCATCATAATACACACCTTTGCTGCCGAAACGAAGCCAGACAACAGATTTCC
61 ---------+---------+---------+---------+---------+---------+ 120
ACGTCGAAAGTAGTATTATGTGTGGAAACGACGGCTTTGCTTCGGTCTGTTGTCTAAAGG

ATCAGCAGGATGTGGGGGCTCAAGGTTCTGCTGCTACCTGTGGTGAGCTTTGCTCTGTAC
121 ---------+---------+---------+---------+---------+---------+ 180
TAGTCGTCCTACACCCCCGAGTTCCAAGACGACGATGGACACCACTCGAAACGAGACATG
         MetTrpGlyLeuLysValLeuLeuLeuProValValSerPheAlaLeuTyr

CCTGAGGAGATACTGGACACCCACTGGGAGCTATGGAAGAAGACCCACAGGAAGCAATAT
181 ---------+---------+---------+---------+---------+---------+ 240
GGACTCCTCTATGACCTGTGGGTGACCCTCGATACCTTCTTCTGGGTGTCCTTCGTTATA
ProGluGluIleLeuAspThrHisTrpGluLeuTrpLysLysThrHisArgLysGlnTyr

AACAACAAGGTGGATGAAATCTCTCGGCGTTTAATTTGGGAAAAAAACCTGAAGTATATT
241 ---------+---------+---------+---------+---------+---------+ 300
TTGTTGTTCCACCTACTTTAGAGAGCCGCAAATTAAACCCTTTTTTTGGACTTCATATAA
AsnAsnLysValAspGluIleSerArgArgLeuIleTrpGluLysAsnLeuLysTyrIle

TCCATCCATAACCTTGAGGCTTCTCTTGGTGTCCATACATATGAACTGGCTATGAACCAC
301 ---------+---------+---------+---------+---------+---------+ 360
AGGTAGGTATTGGAACTCCGAAGAGAACCACAGGTATGTATACTTGACCGATACTTGGTG
SerIleHisAsnLeuGluAlaSerLeuGlyValHisThrTyrGluLeuAlaMetAsnHis

CTGGGGGACATGACCAGTGAAGAGGTGGTTCAGAAGATGACTGGACTCAAAGTACCCCTG
361 ---------+---------+---------+---------+---------+---------+ 420
GACCCCCTGTACTGGTCACTTCTCCACCAAGTCTTCTACTGACCTGAGTTTCATGGGGAC
LeuGlyAspMetThrSerGluGluValValGlnLysMetThrGlyLeuLysValProLeu

TCTCATTCCCGCAGTAATGACACCCTTTATATCCCAGAATGGGAAGGTAGAGCCCCAGAC
421 ---------+---------+---------+---------+---------+---------+ 480
AGAGTAAGGGCGTCATTACTGTGGGAAATATAGGGTCTTACCCTTCCATCTCGGGGTCTG
SerHisSerArgSerAsnAspThrLeuTyrIleProGluTrpGluGlyArgAlaProAsp

TCTGTCGACTATCGAAAGAAAGGATATGTTACTCCTGTCAAAAATCAGGGTCAGTGTGGT
481 ---------+---------+---------+---------+---------+---------+ 540
AGACAGCTGATAGCTTTCTTTCCTATACAATGAGGACAGTTTTTAGTCCCAGTCACACCA
SerValAspTyrArgLysLysGlyTyrValThrProValLysAsnGlnGlyGlnCysGly

TCCTGTTGGGCTTTTAGCTCTGTGGGTGCCCTGGAGGGCCAACTCAAGAAGAAAACTGGC
541 ---------+---------+---------+---------+---------+---------+ 600
AGGACAACCCGAAAATCGAGACACCCACGGGACCTCCCGGTTGAGTTCTTCTTTTGACCG
SerCysTrpAlaPheSerSerValGlyAlaLeuGluGlyGlnLeuLysLysLysThrGly
   139

AAACTCTTAAATCTGAGTCCCCAGAACCTAGTGGATTGTGTGTCTGAGAATGATGGCTGT
601 ---------+---------+---------+---------+---------+---------+ 660
TTTGAGAATTTAGACTCAGGGGTCTTGGATCACCTAACACACAGACTCTTACTACCGACA
LysLeuLeuAsnLeuSerProGlnAsnLeuValAspCysValSerGluAsnAspGlyCys

GGAGGGGGCTACATGACCAATGCCTTCCAATATGTGCAGAAGAACCGGGGTATTGACTCT
661 ---------+---------+---------+---------+---------+---------+ 720
CCTCCCCCGATGTACTGGTTACGGAAGGTTATACACGTCTTCTTGGCCCCATAACTGAGA
GlyGlyGlyTyrMetThrAsnAlaPheGlnTyrValGlnLysAsnArgGlyIleAspSer

GAAGATGCCTACCCATATGTGGGACAGGAAGAGAGTTGTATGTACAACCCAACAGGCAAG
721 ---------+---------+---------+---------+---------+---------+ 780
CTTCTACGGATGGGTATACACCCTGTCCTTCTCTCAACATACATGTTGGGTTGTCCGTTC
GluAspAlaTyrProTyrValGlyGlnGluGluSerCysMetTyrAsnProThrGlyLys

GCAGCTAAATGCAGAGGGTACAGAGAGATCCCCGAGGGGAATGAGAAAGCCCTGAAGAGG
781 ---------+---------+---------+---------+---------+---------+ 840
CGTCGATTTACGTCTCCCATGTCTCTCTAGGGGCTCCCCTTACTCTTTCGGGACTTCTCC
AlaAlaLysCysArgGlyTyrArgGluIleProGluGlyAsnGluLysAlaLeuLysArg

GCAGTGGCCCGAGTGGGACCTGTCTCTGTGGCCATTGATGCAAGCCTGACCTCCTTCCAG
841 ---------+---------+---------+---------+---------+---------+ 900
CGTCACCGGGCTCACCCTGGACAGAGACACCGGTAACTACGTTCGGACTGGAGGAAGGTC
AlaValAlaArgValGlyProValSerValAlaIleAspAlaSerLeuThrSerPheGln

TTTTACAGCAAAGGTGTGTATTATGATGAAAGCTGCAATAGCGATAATCTGAACCATGCG
901 ---------+---------+---------+---------+---------+---------+ 960
AAAATGTCGTTTCCACACATAATACTACTTTCGACGTTATCGCTATTAGACTTGGTACGC
PheTyrSerLysGlyValTyrTyrAspGluSerCysAsnSerAspAsnLeuAsnHisAla

GTTTTGGCAGTGGGATATGGAATCCAGAAGGGAAACAAGCACTGGATAATTAAAAACAGC
961 ---------+---------+---------+---------+---------+---------+ 1020
CAAAACCGTCACCCTATACCTTAGGTCTTCCCTTTGTTCGTGACCTATTAATTTTTGTCG
ValLeuAlaValGlyTyrGlyIleGlnLysGlyAsnLysHisTrpIleIleLysAsnSer

TGGGGAGAAAACTGGGGAAACAAAGGATATATCCTCATGGCTCGAAATAAGAACAACGCC
1021 ---------+---------+---------+---------+---------+---------+ 1080
ACCCCTCTTTTGACCCCTTTGTTTCCTATATAGGAGTACCGAGCTTTATTCTTGTTGCGG
TrpGlyGluAsnTrpGlyAsnLysGlyTyrIleLeuMetAlaArgAsnLysAsnAsnAla

TGTGGCATTGCCAACCTGGCCAGCTTCCCCAAGATGTGACTCCAGCCAGCCAAATCCATC
1081 ---------+---------+---------+---------+---------+---------+ 1140
ACACCGTAACGGTTGGACCGGTCGAAGGGGTTCTACACTGAGGTCGGTCGGTTTAGGTAG
CysGlyIleAlaAsnLeuAlaSerPheProLysMetEnd

CTGCTCTTCCATTTCTTCCACGATGGTGCAGTGTAACGATGCACTTTGGAAGGGAGTTGG
1141 ---------+---------+---------+---------+---------+---------+ 1200
GACGAGAAGGTAAAGAAGGTGCTACCACGTCACATTGCTACGTGAAACCTTCCCTCAACC

TGTGCTATTTTTGAAGCAGATGTGGTGATACTGAGATTGTCTGTTCAGTTTCCCCATTTG
1201 ---------+---------+---------+---------+---------+---------+ 1260
ACACGATAAAAACTTCGTCTACACCACTATGACTCTAACAGACAAGTCAAAGGGGTAAAC

TTTGTGCTTCAAATGATCCTTCCTACTTTGCTTCTCTCCACCCATGACCTTTTTCACTGT
1261 ---------+---------+---------+---------+---------+---------+ 1320
AAACACGAAGTTTACTAGGAAGGATGAAACGAAGAGAGGTGGGTACTGGAAAAAGTGACA

GGCCATCAGGACTTTCCCTGACAGCTGTGTACTCTTAGGCTAAGAGATGTGACTACAGCC
1321 ---------+---------+---------+---------+---------+---------+ 1380
CCGGTAGTCCTGAAAGGGACTGTCGACACATGAGAATCCGATTCTCTACACTGATGTCGG

TGCCCCTGACTGTGTTGTCCCAGGGCTGATGCTGTACAGGTACAGGCTGGAGATTTTCAC
1381 ---------+---------+---------+---------+---------+---------+ 1440
ACGGGGACTGACACAACAGGGTCCCGACTACGACATGTCCATGTCCGACCTCTAAAAGTG

ATAGGTTAGATTCTCATTCACGGGACTAGTTAGCTTTAAGCACCCTAGAGGACTAGGGTA
1441 ---------+---------+---------+---------+---------+---------+ 1500
TATCCAATCTAAGAGTAAGTGCCCTGATCAATCGAAATTCGTGGGATCTCCTGATCCCAT

ATCTGACTTCTCACTTCCTAAGTTCCCTTCTATATCCTCAAGGTAGAAATGTCTATGTTT
1501 ---------+---------+---------+---------+---------+---------+ 1560
TAGACTGAAGAGTGAAGGATTCAAGGGAAGATATAGGAGTTCCATCTTTACAGATACAAA

TCTACTCCAATTCATAAATCTATTCATAAGTCTTTGGTACAAGTTTACATGATAAAAAGA
1561 ---------+---------+---------+---------+---------+---------+ 1620
AGATGAGGTTAAGTATTTAGATAAGTATTCAGAAACCATGTTCAAATGTACTATTTTTCT

AATGTGATTTGTCTTCCCTTCTTTGCACTTTTGAAATAAAGTATTTATC
1621 ---------+---------+---------+---------+--------- 1669
TTACACTAAACAGAAGGGAAGAAACGTGAAAACTTTATTTCATAAATAG

CTGCAGGAATTCGGCACGAGGGGTGCTATTGTGAGGCGGTTGTAGAAGTTAATAAAGGTA
1 ---------+---------+---------+---------+---------+---------+ 60
GACGTCCTTAAGCCGTGCTCCCCACGATAACACTCCGCCAACATCTTCAATTATTTCCAT

TCCATGGAGAACACTGAAAACTCAGTGGATTCAAAATCCATTAAAAATTTGGAACCAAAG
61 ---------+---------+---------+---------+---------+---------+ 120
AGGTACCTCTTGTGACTTTTGAGTCACCTAAGTTTTAGGTAATTTTTAAACCTTGGTTTC
   MetGluAsnThrGluAsnSerValAspSerLysSerIleLysAsnLeuGluProLys

ATCATACATGGAAGCGAATCAATGGACTCTGGAATATCCCTGGACAACAGTTATAAAATG
121 ---------+---------+---------+---------+---------+---------+ 180
TAGTATGTACCTTCGCTTAGTTACCTGAGACCTTATAGGGACCTGTTGTCAATATTTTAC
IleIleHisGlySerGluSerMetAspSerGlyIleSerLeuAspAsnSerTyrLysMet

GATTATCCTGAGATGGGTTTATGTATAATAATTAATAATAAGAATTTTCATAAGAGCACT
181 ---------+---------+---------+---------+---------+---------+ 240
CTAATAGGACTCTACCCAAATACATATTATTAATTATTATTCTTAAAAGTATTCTCGTGA
AspTyrProGluMetGlyLeuCysIleIleIleAsnAsnLysAsnPheHisLysSerThr

GGAATGACATCTCGGTCTGGTACAGATGTCGATGCAGCAAACCTCAGGGAAACATTCAGA
241 ---------+---------+---------+---------+---------+---------+ 300
CCTTACTGTAGAGCCAGACCATGTCTACAGCTACGTCGTTTGGAGTCCCTTTGTAAGTCT
GlyMetThrSerArgSerGlyThrAspValAspAlaAlaAsnLeuArgGluThrPheArg

AACTTGAAATATGAAGTCAGGAATAAAAATGATCTTACACGTGAAGAAATTGTGGAATTG
301 ---------+---------+---------+---------+---------+---------+ 360
TTGAACTTTATACTTCAGTCCTTATTTTTACTAGAATGTGCACTTCTTTAACACCTTAAC
AsnLeuLysTyrGluValArgAsnLysAsnAspLeuThrArgGluGluIleValGluLeu

ATGCGTGATGTTTCTAAAGAAGATCACAGCAAAAGGAGCAGTTTTGTTTGTGTGCTTCTG
361 ---------+---------+---------+---------+---------+---------+ 420
TACGCACTACAAAGATTTCTTCTAGTGTCGTTTTCCTCGTCAAAACAAACACACGAAGAC
MetArgAspValSerLysGluAspHisSerLysArgSerSerPheValCysValLeuLeu

AGCCATGGTGAAGAAGGAATAATTTTTGGAACAAATGGACCTGTTGACCTGAAAAAAATA
421 ---------+---------+---------+---------+---------+---------+ 480
TCGGTACCACTTCTTCCTTATTAAAAACCTTGTTTACCTGGACAACTGGACTTTTTTTAT
SerHisGlyGluGluGlyIleIlePheGlyThrAsnGlyProValAspLeuLysLysIle

ACAAACTTTTTCAGAGGGGATCGTTGTAGAAGTCTAACTGGAAAACCCAAACTTTTCATT
481 ---------+---------+---------+---------+---------+---------+ 540
TGTTTGAAAAAGTCTCCCCTAGCAACATCTTCAGATTGACCTTTTGGGTTTGAAAAGTAA
ThrAsnPhePheArgGlyAspArgCysArgSerLeuThrGlyLysProLysLeuPheIle

ATTCAGGCCTGCCGTGGTACAGAACTGGACTGTGGCATTGAGACAGACAGTGGTGTTGAT
541 ---------+---------+---------+---------+---------+---------+ 600
TAAGTCCGGACGGCACCATGTCTTGACCTGACACCGTAACTCTGTCTGTCACCACAACTA
IleGlnAlaCysArgGlyThrGluLeuAspCysGlyIleGluThrAspSerGlyValAsp
         163

GATGACATGGCGTGTCATAAAATACCAGTGGAGGCCGACTTCTTGTATGCATACTCCACA
601 ---------+---------+---------+---------+---------+---------+ 660
CTACTGTACCGCACAGTATTTTATGGTCACCTCCGGCTGAAGAACATACGTATGAGGTGT
AspAspMetAlaCysHisLysIleProValGluAlaAspPheLeuTyrAlaTyrSerThr

GCACCTGGTTATTATTCTTGGCGAAATTCAAAGGATGGCTCCTGGTTCATCCAGTCGCTT
661 ---------+---------+---------+---------+---------+---------+ 720
CGTGGACCAATAATAAGAACCGCTTTAAGTTTCCTACCGAGGACCAAGTAGGTCAGCGAA
AlaProGlyTyrTyrSerTrpArgAsnSerLysAspGlySerTrpPheIleGlnSerLeu

TGTGCCATGCTGAAACAGTATGCCGACAAGCTTGAATTTATGCACATTCTTACCCGGGTT
721 ---------+---------+---------+---------+---------+---------+ 780
ACACGGTACGACTTTGTCATACGGCTGTTCGAACTTAAATACGTGTAAGAATGGGCCCAA
CysAlaMetLeuLysGlnTyrAlaAspLysLeuGluPheMetHisIleLeuThrArgVal

AACCGAAAGGTGGCAACAGAATTTGAGTCCTTTTCCTTTGACGCTACTTTTCATGCAAAG
781 ---------+---------+---------+---------+---------+---------+ 840
TTGGCTTTCCACCGTTGTCTTAAACTCAGGAAAAGGAAACTGCGATGAAAAGTACGTTTC
AsnArgLysValAlaThrGluPheGluSerPheSerPheAspAlaThrPheHisAlaLys

AAACAGATTCCATGTATTGTTTCCATGCTCACAAAAGAACTCTATTTTTATCACTAAAGA
841 ---------+---------+---------+---------+---------+---------+ 900
TTTGTCTAAGGTACATAACAAAGGTACGAGTGTTTTCTTGAGATAAAAATAGTGATTTCT
LysGlnIleProCysIleValSerMetLeuThrLysGluLeuTyrPheTyrHisEnd

AATGGTTGGTTGGTGGTTTTTTTTAGTTTGTATGCCAAGTGAGAAGATGGTATATTTGGT
901 ---------+---------+---------+---------+---------+---------+ 960
TTACCAACCAACCACCAAAAAAAATCAAACATACGGTTCACTCTTCTACCATATAAACCA

ACTGTATTTCCCTCTCATTTTGACCTACTCTCATGCTGCAG
961 ---------+---------+---------+---------+- 1001
TGACATAAAGGGAGAGTAAAACTGGATGAGAGTACGACGTC